[ https://issues.apache.org/jira/browse/ASTERIXDB-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Maxon updated ASTERIXDB-3523:
---------------------------------
Labels: triaged (was: )
> Eliminating Non-Matching Secondary Keys After Secondary Index Search
> --------------------------------------------------------------------
>
> Key: ASTERIXDB-3523
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-3523
> Project: Apache AsterixDB
> Issue Type: Improvement
> Components: COMP - Compiler
> Reporter: shahrzad shirazi
> Priority: Minor
> Labels: triaged
>
> In range queries bounded only from below or only from above, especially
> when the data is heterogeneous or contains many null values, a secondary
> index search can return many entries that do not ultimately match the
> query condition. These entries still proceed to the primary index search
> and are eliminated only afterward.
> For example, consider the following queries on a *customers* dataset with a
> secondary index on the *age* field, which is not declared in the datatype:
> {*}Query 1{*}:
> {code:sql}
> SELECT * FROM customers c WHERE c.age < 20;
> {code}
> If many records have null or missing age values, the secondary index search
> will return numerous keys, which will pass through the primary index search
> but be filtered out afterward.
> {*}Query 2{*}:
> {code:sql}
> SELECT * FROM customers c WHERE c.age > 40;
> {code}
> Similarly, if there are records with non-numeric values in the age field,
> these will be included in the secondary index results and pass through the
> primary index search but be filtered out afterward.
>
> A solution to this inefficiency is to add a selection operator immediately
> after the secondary index search. This operator would filter out secondary
> keys that don’t meet the query criteria before they proceed to the primary
> index search, reducing unnecessary processing and improving overall
> efficiency.
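
The effect of the proposed selection operator can be illustrated with a small
toy model. This is only a sketch and not AsterixDB code: the data, the
function names (secondary_search, run_query, is_match) and the type ordering
(null/missing sort below numbers, strings sort above them) are assumptions
made for illustration. It counts how many primary index lookups each of the
two queries needs with and without a filter placed right after the secondary
index search.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical toy data: primary key -> record. "age" is an open field, so it
# may be missing, null, or non-numeric.
PRIMARY: Dict[int, Dict[str, Any]] = {
    1: {"name": "a", "age": 15},
    2: {"name": "b", "age": None},   # null age
    3: {"name": "c"},                # missing age (treated like null here)
    4: {"name": "d", "age": "n/a"},  # non-numeric age
    5: {"name": "e", "age": 45},
    6: {"name": "f", "age": 19},
}

# Toy secondary index: one (secondary key, primary key) entry per record.
SECONDARY: List[Tuple[Any, int]] = [(r.get("age"), pk) for pk, r in PRIMARY.items()]


def sort_key(value: Any) -> Tuple[int, Any]:
    """Assumed cross-type order: null/missing < numbers < strings."""
    if value is None:
        return (0, 0)
    if isinstance(value, (int, float)):
        return (1, value)
    return (2, str(value))


def secondary_search(low: Optional[float] = None,
                     high: Optional[float] = None) -> List[Tuple[Any, int]]:
    """One-sided range scan (exclusive bounds) over the toy secondary index."""
    out = []
    for key, pk in SECONDARY:
        k = sort_key(key)
        if low is not None and not k > sort_key(low):
            continue
        if high is not None and not k < sort_key(high):
            continue
        out.append((key, pk))
    return out


def is_match(age: Any, pred: Callable[[float], bool]) -> bool:
    """Only numeric ages can satisfy the comparison."""
    return isinstance(age, (int, float)) and pred(age)


def run_query(pred: Callable[[float], bool],
              low: Optional[float] = None,
              high: Optional[float] = None,
              prefilter: bool = False) -> Tuple[List[int], int]:
    """Return (matching primary keys, number of primary index lookups)."""
    candidates = secondary_search(low, high)
    if prefilter:
        # Proposed selection operator: check the predicate on the secondary
        # key before the primary index search.
        candidates = [(k, pk) for k, pk in candidates if is_match(k, pred)]
    lookups = 0
    results = []
    for _, pk in candidates:
        lookups += 1                      # one primary index lookup
        record = PRIMARY[pk]
        if is_match(record.get("age"), pred):
            results.append(pk)            # post-lookup filter, as today
    return sorted(results), lookups


if __name__ == "__main__":
    for label, pred, bounds in [
        ("age < 20", lambda a: a < 20, {"high": 20}),
        ("age > 40", lambda a: a > 40, {"low": 40}),
    ]:
        for prefilter in (False, True):
            rows, lookups = run_query(pred, prefilter=prefilter, **bounds)
            print(label, "prefilter =", prefilter, "rows =", rows, "lookups =", lookups)
```

In this toy data set both variants return the same rows for each query, and
the early filter only reduces the number of primary index lookups. How much
it would save in practice depends on how many null, missing, or mistyped
values the secondary index actually holds.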