[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419091#comment-15419091
 ] 

Murtadha Hubail edited comment on ASTERIXDB-1451 at 8/19/16 5:23 PM:
---------------------------------------------------------------------


I have been investigating this issue for some time. It is related to neither the 
VBC page size nor upsert. When a record that has an enforced index of any type 
other than INT64 is written to a disk component, a later delete operation that 
carries an antimatter tuple for that same record does not hide the entry already 
written to the disk component of the enforced index. As a result, any query that 
uses the enforced index returns duplicate records.

I haven't found the root cause yet, but I believe it is related to the variable 
type propagation during query compilation. I will discuss it with [~buyingyi] 
and [~alamoudi].
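
To make the symptom concrete, here is a toy Python model of an LSM secondary index. It is only a sketch and not AsterixDB code: the names `encode_key`, `entry_key`, `search`, and `query_after_upsert` are hypothetical, and the idea that the delete path builds its key with a different integer type than the one stored in the index is an assumption used for illustration, not a confirmed root cause. It shows that an antimatter entry only hides a disk entry when both keys serialize to the same bytes.

```python
import struct

ANTIMATTER = object()  # marker for a delete (antimatter) entry


def encode_key(value, type_name):
    """Serialize an integer as a type-tagged big-endian int32 or int64."""
    fmt = {"int32": ">i", "int64": ">q"}[type_name]
    return type_name.encode() + b":" + struct.pack(fmt, value)


def entry_key(custkey, orderkey, custkey_type):
    """Secondary index key: (o_custkey, o_orderkey). The primary key is int64."""
    return encode_key(custkey, custkey_type) + encode_key(orderkey, "int64")


def search(components):
    """Scan components newest first. The newest entry for a key wins, and an
    antimatter entry hides older entries that have identical key bytes."""
    seen = set()
    results = []
    for component in components:
        for key, payload in component.items():
            if key in seen:
                continue
            seen.add(key)
            if payload is not ANTIMATTER:
                results.append(payload)
    return results


def query_after_upsert(delete_key_type):
    """Return the primary keys found after upserting o_custkey from 1 to 2.

    delete_key_type is the (hypothetical) type used to build the antimatter key.
    """
    # Disk component: the index stores (o_custkey=1, o_orderkey=1) as int32.
    disk = {entry_key(1, 1, "int32"): 1}
    # Memory component: antimatter for the old entry plus the new entry.
    memory = {
        entry_key(1, 1, delete_key_type): ANTIMATTER,
        entry_key(2, 1, "int32"): 1,
    }
    return search([memory, disk])


print(query_after_upsert("int32"))  # [1]     antimatter matches the disk entry
print(query_after_upsert("int64"))  # [1, 1]  antimatter misses, duplicate row
```

In this model the second call returns the same primary key twice, which mirrors the duplicated rows in the test output quoted in the issue description.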

The issue can be reproduced with the following statements:

{code}
drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type OrderOpenType as open {
  o_orderkey: int64
};

create dataset OrdersOpen(OrderOpenType)
primary key o_orderkey;

insert into dataset OrdersOpen (
  {"o_orderkey": 1, "o_custkey": 1}
);

create index idx_Orders_Custkey on OrdersOpen(o_custkey:int32) enforced;

upsert into dataset OrdersOpen (
  {"o_orderkey": 1, "o_custkey": 2} );

for $o in dataset('OrdersOpen')
where $o.o_custkey >= -1
return {
  "o_orderkey": $o.o_orderkey
};
{code}


was (Author: mhubail):


I have been investigating this issue for some time. It is not related to the 
VBC page size nor upsert. When any record, with an enforced index of a type 
other than INT64, is written to a disk component, a delete operation with an 
antimatter tuple for that same record will not hide the record written in the 
disk component of the enforced index. Therefore, duplicate records will be 
returned in any query using the enforced index.

I haven't found the root cause yet, but I believe it is related to the variable 
type propagation during query compilation. I will discuss it with [~buyingyi] 
and [~alamoudi].

The issue can be easily reproduced by the following:

{code}
drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type OrderOpenType as open {
  o_orderkey: int64
}

create dataset OrdersOpen(OrderOpenType)
primary key o_orderkey;

insert into dataset OrdersOpen (
  {"o_orderkey": 1,
  "o_custkey": 1,}
)

create index idx_Orders_Custkey on OrdersOpen(o_custkey:int32) enforced;

for $o in dataset('OrdersOpen')
where
 $o.o_custkey >=-1
return {
  "o_orderkey": $o.o_orderkey
}
{code}

> Upsert: Open Index test fails with duplicate rows in result when VBC page 
> size is reduced
> -----------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1451
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1451
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Michael Blow
>            Assignee: Murtadha Hubail
>
> To repro:
> - configure storage.memorycomponent.pagesize to 8k, and increase 
> storage.memorycomponent.numpages to 24
> - run asterix-app runtime tests
> - observe failure in open-index test with duplicated rows as shown below.
> $ diff -du src/test/resources/runtimets/results/upsert/open-index/open-index.1.adm rttest/results/upsert/open-index.adm
> --- src/test/resources/runtimets/results/upsert/open-index/open-index.1.adm   2016-04-27 20:40:58.000000000 -0700
> +++ rttest/results/upsert/open-index.adm      2016-05-16 19:08:57.000000000 -0700
> @@ -1099,11 +1099,15 @@
>  { "o_orderkey": 5927, "o_custkey": 116 }
>  { "o_orderkey": 5952, "o_custkey": 148 }
>  { "o_orderkey": 5955, "o_custkey": 94 }
> +{ "o_orderkey": 5955, "o_custkey": 94 }
> +{ "o_orderkey": 5957, "o_custkey": 89 }
>  { "o_orderkey": 5957, "o_custkey": 89 }
>  { "o_orderkey": 5958, "o_custkey": 115 }
>  { "o_orderkey": 5984, "o_custkey": 70 }
>  { "o_orderkey": 5985, "o_custkey": 143 }
>  { "o_orderkey": 5986, "o_custkey": 115 }
> +{ "o_orderkey": 5986, "o_custkey": 115 }
> +{ "o_orderkey": 5987, "o_custkey": 64 }
>  { "o_orderkey": 5987, "o_custkey": 64 }
>  { "o_orderkey": 10986, "o_custkey": 115 }
>  { "o_orderkey": 10987, "o_custkey": 64 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)