[
https://issues.apache.org/jira/browse/ASTERIXDB-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419091#comment-15419091
]
Murtadha Hubail edited comment on ASTERIXDB-1451 at 8/19/16 5:23 PM:
---------------------------------------------------------------------
I have been investigating this issue for some time. It is related to neither
the VBC page size nor upsert. When a record whose enforced index has a key type
other than INT64 is written to a disk component, a later delete operation that
produces an antimatter tuple for that same record does not hide the entry
already written to the disk component of the enforced index. As a result, any
query that uses the enforced index returns duplicate records.
I haven't found the root cause yet, but I believe it is related to variable
type propagation during query compilation. I will discuss it with [~buyingyi]
and [~alamoudi].
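To make the symptom above concrete, here is a toy Python model of an LSM index in which an antimatter entry hides an older entry only when the two keys are byte-for-byte equal. This is not AsterixDB code: `ToyLsmIndex`, `encode_key`, `live_entries_after_upsert`, and the byte layout are all hypothetical names and choices made for illustration. The int32/int64 encoding mismatch is only one assumed way the antimatter key could fail to match the stored key. It is not a confirmed root cause.

```python
import struct


def encode_key(custkey, orderkey, custkey_fmt):
    """Serialize (secondary key, primary key); custkey_fmt is 'i' (int32) or 'q' (int64).

    Hypothetical layout, chosen only so that the two widths produce different bytes.
    """
    return struct.pack(">" + custkey_fmt + "q", custkey, orderkey)


class ToyLsmIndex:
    """Toy LSM index: one memory component plus a list of flushed disk components."""

    def __init__(self):
        self.memory = {}  # key bytes -> True if the entry is antimatter
        self.disk = []    # flushed components, newest first

    def insert(self, key):
        self.memory[key] = False

    def delete(self, key):
        # A delete is recorded as an antimatter entry instead of an in-place removal.
        self.memory[key] = True

    def flush(self):
        self.disk.insert(0, self.memory)
        self.memory = {}

    def search(self):
        """Return live keys; the newest entry for a given key wins."""
        seen, live = set(), []
        for component in [self.memory] + self.disk:
            for key, antimatter in component.items():
                if key in seen:
                    continue  # hidden by a newer entry with the same bytes
                seen.add(key)
                if not antimatter:
                    live.append(key)
        return live


def live_entries_after_upsert(insert_fmt, delete_fmt):
    """Insert o_custkey=1, flush to disk, then upsert o_custkey=2 for the same o_orderkey."""
    index = ToyLsmIndex()
    index.insert(encode_key(1, 1, insert_fmt))
    index.flush()
    # An upsert is modeled as delete-old (antimatter) followed by insert-new.
    index.delete(encode_key(1, 1, delete_fmt))
    index.insert(encode_key(2, 1, insert_fmt))
    return len(index.search())


matching = live_entries_after_upsert("i", "i")    # antimatter key matches the stored key
mismatched = live_entries_after_upsert("i", "q")  # assumed mismatch: int32 stored, int64 deleted
print(matching, mismatched)
```

With matching encodings the old entry is hidden and one live entry remains. With the assumed mismatch the antimatter entry never lines up with the entry in the disk component, so both the old and the new entry survive, which resembles the duplicate rows reported in this issue.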
The issue can be easily reproduced by the following:
{code}
drop dataverse test if exists;
create dataverse test;
use dataverse test;
create type OrderOpenType as open {
  o_orderkey: int64
};
create dataset OrdersOpen(OrderOpenType)
  primary key o_orderkey;
insert into dataset OrdersOpen (
  {"o_orderkey": 1, "o_custkey": 1}
);
create index idx_Orders_Custkey on OrdersOpen(o_custkey:int32) enforced;
upsert into dataset OrdersOpen (
  {"o_orderkey": 1, "o_custkey": 2}
);
for $o in dataset('OrdersOpen')
where $o.o_custkey >= -1
return {
  "o_orderkey": $o.o_orderkey
};
{code}
was (Author: mhubail):
I have been investigating this issue for some time. It is not related to the
VBC page size nor upsert. When any record, with an enforced index of a type
other than INT64, is written to a disk component, a delete operation with an
antimatter tuple for that same record will not hide the record written in the
disk component of the enforced index. Therefore, duplicate records will be
returned in any query using the enforced index.
I haven't found the root cause yet, but I believe it is related to the variable
type propagation during query compilation. I will discuss it with [~buyingyi]
and [~alamoudi].
The issue can be easily reproduced by the following:
{code}
drop dataverse test if exists;
create dataverse test;
use dataverse test;
create type OrderOpenType as open {
o_orderkey: int64
}
create dataset OrdersOpen(OrderOpenType)
primary key o_orderkey;
insert into dataset OrdersOpen (
{"o_orderkey": 1,
"o_custkey": 1,}
)
create index idx_Orders_Custkey on OrdersOpen(o_custkey:int32) enforced;
for $o in dataset('OrdersOpen')
where
$o.o_custkey >=-1
return {
"o_orderkey": $o.o_orderkey
}
{code}
> Upsert: Open Index test fails with duplicate rows in result when VBC page
> size is reduced
> -----------------------------------------------------------------------------------------
>
> Key: ASTERIXDB-1451
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1451
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Michael Blow
> Assignee: Murtadha Hubail
>
> To repro:
> - configure storage.memorycomponent.pagesize to 8k, and increase
> storage.memorycomponent.numpages to 24
> - run asterix-app runtime tests
> - observe failure in open-index test with duplicated rows as shown below.
> $ diff -du
> src/test/resources/runtimets/results/upsert/open-index/open-index.1.adm
> rttest/results/upsert/open-index.adm
> --- src/test/resources/runtimets/results/upsert/open-index/open-index.1.adm
> 2016-04-27 20:40:58.000000000 -0700
> +++ rttest/results/upsert/open-index.adm 2016-05-16 19:08:57.000000000
> -0700
> @@ -1099,11 +1099,15 @@
> { "o_orderkey": 5927, "o_custkey": 116 }
> { "o_orderkey": 5952, "o_custkey": 148 }
> { "o_orderkey": 5955, "o_custkey": 94 }
> +{ "o_orderkey": 5955, "o_custkey": 94 }
> +{ "o_orderkey": 5957, "o_custkey": 89 }
> { "o_orderkey": 5957, "o_custkey": 89 }
> { "o_orderkey": 5958, "o_custkey": 115 }
> { "o_orderkey": 5984, "o_custkey": 70 }
> { "o_orderkey": 5985, "o_custkey": 143 }
> { "o_orderkey": 5986, "o_custkey": 115 }
> +{ "o_orderkey": 5986, "o_custkey": 115 }
> +{ "o_orderkey": 5987, "o_custkey": 64 }
> { "o_orderkey": 5987, "o_custkey": 64 }
> { "o_orderkey": 10986, "o_custkey": 115 }
> { "o_orderkey": 10987, "o_custkey": 64 }