[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfeng Jia updated ASTERIXDB-1905:
------------------------------------
    Description: 
It seems we are not passing the correct (or enough) information to the secondary 
index BulkloadOperator when the index is created after the data has already been 
ingested. During the bulkload, only the secondary keys are sent to the index, 
while the filter information points to the first value of the next secondary 
keys, which makes the filter information incorrect. 

We actually have several test cases for this, e.g., 
`filters/load-with-secondary-rtree`. However, that test passes only by 
chance. The filter type tag is *double* (it comes from the first key of the 
secondary keys) and the search query value is *datetime*, so the 
filter-satisfies function ends up calling the rawComparator, which compares 
only the first byte, and that always yields 
*index.filterValue < query.filterValue*. We happened to choose the query 
*$m.send-time < datetime("2012-11-20T10:10:00.000Z")*. If we choose the *>* 
relation instead, nothing is returned. 
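
To make the "passes by chance" argument concrete, here is a minimal Java sketch of a comparison that is decided by the leading type-tag byte. It is not AsterixDB code: the class name, the helper methods, and the numeric tag values are assumptions for illustration only. The only property that matters is that the *double* tag sorts below the *datetime* tag.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

// Sketch only: tag values and helper names are hypothetical, not AsterixDB's API.
class FilterTagSketch {
    // Assumed one-byte type tags; all that matters is that DOUBLE sorts below DATETIME.
    static final byte TAG_DOUBLE = 12;
    static final byte TAG_DATETIME = 16;

    // Serialize a value as [type tag][8-byte payload].
    static byte[] serialize(byte tag, long payload) {
        return ByteBuffer.allocate(9).put(tag).putLong(payload).array();
    }

    // Stand-in for a raw comparator: when the tags differ, the first byte decides.
    static int compareRaw(byte[] a, byte[] b) {
        int byTag = Byte.compare(a[0], b[0]);
        if (byTag != 0) {
            return byTag;
        }
        long pa = ByteBuffer.wrap(a, 1, 8).getLong();
        long pb = ByteBuffer.wrap(b, 1, 8).getLong();
        return Long.compare(pa, pb);
    }

    public static void main(String[] args) {
        long queryMillis = Instant.parse("2012-11-20T10:10:00.000Z").toEpochMilli();
        byte[] query = serialize(TAG_DATETIME, queryMillis);
        // The mis-wired filter holds a double taken from the secondary key (value is arbitrary).
        byte[] indexFilter = serialize(TAG_DOUBLE, Double.doubleToLongBits(1.0e300));

        System.out.println("filter < query: " + (compareRaw(indexFilter, query) < 0)); // true
        System.out.println("filter > query: " + (compareRaw(indexFilter, query) > 0)); // false
    }
}
```

Under these assumptions the payload is never consulted, so a *<* query always appears to match the filter and a *>* query never does, whatever value the filter actually holds.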

I marked it *blocking* because, first, it is a correctness problem, and it is 
also blocking my *pass-2ndary-filter-to-primary* patch. Thanks for the help!

  was:
It seems we are not passing the correct(or enough) information to the secondary 
index BulkloadOperator when it is created after the data has been ingested. 
During the bulkload, only the secondary keys are sent to the index, while the 
filter information is pointed to the first value of the next key, which causes 
the filter information is incorrect. 

We actually have several test cases for that, e.g., 
`filters/load-with-secondary-rtree`. However, this test is passed just by 
chance. Since the filter type tag is **double** (comes from the first key of 
the secondary keys), and the search query is **datetime** , and the filter 
satisfies function will only call the rawComparator by comparing the first 
byte, which always matches ** index.filterValue < query.filterValue**.  And we 
happened to choose the ** $m.send-time < datetime("2012-11-20T10:10:00.000Z")** 
query. If we choose the **> ** relation, then nothing is returned. 

I mark it **blocking** because first it's a correctness problem, and it's also 
blocking my *pass-2ndary-filter-to-primary*  patch. Thanks for the help!


> Filter is not created correctly when an index is created using bulkload
> -----------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1905
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1905
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Jianfeng Jia
>            Assignee: Ian Maxon
>            Priority: Blocker
>
> It seems we are not passing the correct (or enough) information to the 
> secondary index BulkloadOperator when the index is created after the data has 
> already been ingested. During the bulkload, only the secondary keys are sent 
> to the index, while the filter information points to the first value of the 
> next secondary keys, which makes the filter information incorrect. 
> We actually have several test cases for this, e.g., 
> `filters/load-with-secondary-rtree`. However, that test passes only by 
> chance. The filter type tag is *double* (it comes from the first key of the 
> secondary keys) and the search query value is *datetime*, so the 
> filter-satisfies function ends up calling the rawComparator, which compares 
> only the first byte, and that always yields 
> *index.filterValue < query.filterValue*. We happened to choose the query 
> *$m.send-time < datetime("2012-11-20T10:10:00.000Z")*. If we choose the *>* 
> relation instead, nothing is returned. 
> I marked it *blocking* because, first, it is a correctness problem, and it is 
> also blocking my *pass-2ndary-filter-to-primary* patch. Thanks for the help!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
