[
https://issues.apache.org/jira/browse/ASTERIXDB-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jianfeng Jia updated ASTERIXDB-1905:
------------------------------------
Description:
It seems we are not passing the correct (or enough) information to the secondary
index BulkloadOperator when the index is created after the data has already
been ingested. During the bulkload, only the secondary keys are sent to the
index, while the filter information points to the first value of the next
secondary keys, which makes the filter information incorrect.
We actually have several test cases for this, e.g.,
`filters/load-with-secondary-rtree`. However, that test passes only by chance.
The filter's type tag is *double* (it comes from the first key of the secondary
keys), while the search query's filter value is *datetime*. The filter-satisfies
function just calls the rawComparator, which ends up comparing only the first
byte, so the result is always *index.filterValue < query.filterValue*. We
happened to choose the query
*$m.send-time < datetime("2012-11-20T10:10:00.000Z")*. If we had chosen the *>*
relation instead, nothing would be returned.
I marked it *blocking* because, first, it's a correctness problem, and second,
it's blocking my *pass-2ndary-filter-to-primary* patch. Thanks for the help!
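To illustrate why the test outcome depends on the direction of the comparison, here is a minimal Java sketch. It is not AsterixDB code: the class, the method names, and the numeric tag values are all hypothetical. It assumes only what the description implies, namely that the first byte of a serialized value is its type tag and that the *double* tag sorts below the *datetime* tag.

```java
final class FilterTagSketch {
    // Assumed tag bytes for illustration; only their relative order matters.
    static final byte TAG_DOUBLE = 12;
    static final byte TAG_DATETIME = 16;

    /** Compares two serialized values by their leading byte only. */
    static int compareFirstByte(byte[] a, byte[] b) {
        return Byte.compare(a[0], b[0]);
    }

    /** Hypothetical check for a "field < bound" query against a stored filter value. */
    static boolean satisfiesLessThan(byte[] indexFilter, byte[] queryBound) {
        return compareFirstByte(indexFilter, queryBound) < 0;
    }

    /** Hypothetical check for a "field > bound" query against a stored filter value. */
    static boolean satisfiesGreaterThan(byte[] indexFilter, byte[] queryBound) {
        return compareFirstByte(indexFilter, queryBound) > 0;
    }

    public static void main(String[] args) {
        // The stored filter wrongly holds a secondary key (a double);
        // the payload bytes after the tag never influence the result.
        byte[] indexFilter = {TAG_DOUBLE, 127, 127, 127, 127, 127, 127, 127, 127};
        byte[] queryBound = {TAG_DATETIME, 0, 0, 0, 0, 0, 0, 0, 0};

        System.out.println(satisfiesLessThan(indexFilter, queryBound));    // true
        System.out.println(satisfiesGreaterThan(indexFilter, queryBound)); // false
    }
}
```

Under these assumptions the `<` check always succeeds and the `>` check always fails, regardless of the actual values, which matches the behavior described above.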
was:
It seems we are not passing the correct(or enough) information to the secondary
index BulkloadOperator when it is created after the data has been ingested.
During the bulkload, only the secondary keys are sent to the index, while the
filter information is pointed to the first value of the next key, which causes
the filter information is incorrect.
We actually have several test cases for that, e.g.,
`filters/load-with-secondary-rtree`. However, this test is passed just by
chance. Since the filter type tag is **double** (comes from the first key of
the secondary keys), and the search query is **datetime** , and the filter
satisfies function will only call the rawComparator by comparing the first
byte, which always matches ** index.filterValue < query.filterValue**. And we
happened to choose the ** $m.send-time < datetime("2012-11-20T10:10:00.000Z")**
query. If we choose the **> ** relation, then nothing is returned.
I mark it **blocking** because first it's a correctness problem, and it's also
blocking my *pass-2ndary-filter-to-primary* patch. Thanks for the help!
> Filter doesn't created correct if create index using bulkload
> -------------------------------------------------------------
>
> Key: ASTERIXDB-1905
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1905
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Jianfeng Jia
> Assignee: Ian Maxon
> Priority: Blocker