[jira] [Updated] (ASTERIXDB-3332) Skip pushing columnar range-filter on secondary index with replicate candidate

Wail Y. Alkowaileet (Jira) Fri, 05 Jan 2024 13:29:09 -0800


     [ 
https://issues.apache.org/jira/browse/ASTERIXDB-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wail Y. Alkowaileet updated ASTERIXDB-3332:
-------------------------------------------
    Description: 
When a query scans (or searches) a dataset more than once, we replicate the 
output of the scan (or search) operator so that the dataset is accessed only 
once. 

In columnar datasets, different scans (and searches) can project different 
fields or can have different filters. In such cases, we scan (or search) the 
same dataset more than once (because the projections and filters differ). 
This can cause several issues. For example, under continuous ingestion, the 
two data-scans could produce different tuples. 

To mitigate this issue, the projected fields of the different scans should be 
consolidated, and the filter expressions of those scans should be normalized 
and consolidated into a single disjunctive expression.  
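
The consolidation described above can be illustrated with a minimal sketch. 
It is written in Python for brevity and is not AsterixDB compiler code: 
ScanRequest and consolidate are hypothetical names, filters are modeled as 
plain predicates over dict records, and each scan's field set is assumed to 
already include the fields its filter reads. 

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Optional

Record = dict
Predicate = Callable[[Record], bool]


@dataclass(frozen=True)
class ScanRequest:
    """Hypothetical description of what one scan of a columnar dataset needs."""
    fields: FrozenSet[str]
    predicate: Optional[Predicate] = None  # None means the scan has no filter


def consolidate(scans: Iterable[ScanRequest]) -> ScanRequest:
    """Merge several scans of the same dataset into a single scan request."""
    scans = list(scans)
    # Projection: the shared scan must read every field any consumer needs.
    fields = frozenset().union(*(s.fields for s in scans))
    # Filter: if any scan is unfiltered, the shared scan cannot filter at all.
    if any(s.predicate is None for s in scans):
        return ScanRequest(fields, None)
    predicates = [s.predicate for s in scans]
    # Otherwise a record is kept if at least one scan wants it (a disjunction).
    return ScanRequest(fields, lambda rec: any(p(rec) for p in predicates))


# Two scans of the same dataset with different projections and filters.
a = ScanRequest(frozenset({"id", "age"}), lambda r: r["age"] > 30)
b = ScanRequest(frozenset({"id", "city"}), lambda r: r["city"] == "Irvine")
merged = consolidate([a, b])
```

Because the merged filter admits a superset of what each scan asked for, each 
consumer of the replicated output would still have to apply its own original 
filter. 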

  was:
When the rule ExtractCommonOperatorRule fires, it introduces several physical 
operators, none of which carries its delivered physical properties. For that 
reason, the compiler cannot determine whether the PKs yielded from the 
secondary index are sorted, which disables the batch point lookup.

Since non-batch point lookups are expensive in columnar datasets, we should 
disable the range filter to allow the columnar unnest-map operator to be 
replicated. 


> Skip pushing columnar range-filter on secondary index with replicate candidate
> ------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-3332
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3332
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler
>    Affects Versions: 0.9.9
>            Reporter: Wail Y. Alkowaileet
>            Assignee: Wail Y. Alkowaileet
>            Priority: Major
>              Labels: triaged
>             Fix For: 0.9.9


