[ 
https://issues.apache.org/jira/browse/ASTERIXDB-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Maxon updated ASTERIXDB-3523:
---------------------------------
    Labels: triaged  (was: )

> Eliminating Non-Matching Secondary Keys After Secondary Index Search
> --------------------------------------------------------------------
>
>                 Key: ASTERIXDB-3523
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3523
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: COMP - Compiler
>            Reporter: shahrzad shirazi
>            Priority: Minor
>              Labels: triaged
>
> In range queries that have only a lower or only an upper bound, especially 
> when data is heterogeneous or contains many null values, a secondary index 
> search can return numerous keys for records that don’t ultimately match the 
> query conditions. These records still go through the primary index search 
> and are eliminated only afterward.
> For example, consider the following queries on a *customers* dataset with a 
> secondary index on the *age* field, which is not declared in the datatype:
> {*}Query 1{*}:
> {code:java}
> SELECT * FROM customers c WHERE c.age < 20;
> {code}
> If many records have null or missing age values, the secondary index search 
> will also return the keys of those records. They pass through the primary 
> index search and are filtered out only afterward.
> {*}Query 2{*}:
> {code:java}
> SELECT * FROM customers c WHERE c.age > 40;
> {code}
> Similarly, if there are records with non-numeric values in the age field, 
> their keys will be included in the secondary index results, pass through the 
> primary index search, and be filtered out only afterward.
>  
> A solution to this inefficiency is to add a selection operator immediately 
> after the secondary index search. This operator would filter out secondary 
> keys that don’t meet the query criteria before they proceed to the primary 
> index search, reducing unnecessary processing and improving overall 
> efficiency.
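
The proposal above can be illustrated with a small Python simulation. This is a rough sketch, not AsterixDB code. It assumes a toy key ordering in which null/missing keys sort before numbers and strings sort after them, so that a one-sided range scan picks them up. All names (sort_key, index_range_scan, matches, run_query, early_select) are hypothetical and exist only for this example.

```python
def sort_key(value):
    """Assumed toy ordering: null/missing < numbers < everything else."""
    if value is None:
        return (0, 0)
    if isinstance(value, (int, float)):
        return (1, value)
    return (2, str(value))


def build_secondary_index(records):
    """Sorted (secondary_key, primary_key) pairs; a missing age is stored as None."""
    entries = [(rec.get("age"), pk) for pk, rec in records.items()]
    return sorted(entries, key=lambda e: (sort_key(e[0]), e[1]))


def index_range_scan(index, low=None, high=None):
    """Exclusive bounds; an absent bound leaves that side open across all types."""
    out = []
    for key, pk in index:
        k = sort_key(key)
        if low is not None and k <= sort_key(low):
            continue
        if high is not None and k >= sort_key(high):
            continue
        out.append((key, pk))
    return out


def matches(key, low=None, high=None):
    """The query predicate: only numeric values inside the range qualify."""
    if not isinstance(key, (int, float)):
        return False
    return (low is None or key > low) and (high is None or key < high)


def run_query(records, index, low=None, high=None, early_select=False):
    """Return (matching primary keys, number of primary index lookups)."""
    candidates = index_range_scan(index, low, high)
    if early_select:
        # The proposed selection operator: filter on the secondary key
        # before any primary index lookup happens.
        candidates = [(k, pk) for k, pk in candidates if matches(k, low, high)]
    results, lookups = [], 0
    for _, pk in candidates:
        lookups += 1                      # one primary index lookup per candidate
        if matches(records[pk].get("age"), low, high):
            results.append(pk)            # the post-lookup filter stays in place
    return results, lookups


records = {
    1: {"age": 15},
    2: {"age": None},
    3: {},                                # age missing
    4: {"age": "forty-five"},
    5: {"age": 52},
    6: {"age": None},
}
index = build_secondary_index(records)

print(run_query(records, index, high=20))                     # Query 1, current plan
print(run_query(records, index, high=20, early_select=True))  # Query 1, with early select
print(run_query(records, index, low=40))                      # Query 2, current plan
print(run_query(records, index, low=40, early_select=True))   # Query 2, with early select
```

In this toy setup both plans return the same records, but the early selection cuts the number of primary index lookups from 4 to 1 for Query 1 and from 2 to 1 for Query 2. The actual savings in AsterixDB would depend on how the index orders and stores such keys, which this sketch only approximates.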



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
