[ https://issues.apache.org/jira/browse/ASTERIXDB-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wail Y. Alkowaileet updated ASTERIXDB-3332:
-------------------------------------------
Description:
When a dataset is scanned (or searched) more than once, we replicate the output
of those operators so that the dataset is accessed only once.
In columnar datasets, different scans (and searches) can project different
fields or apply different filters. In such a case, we scan (or search) the
same dataset more than once (because the projections and filters differ).
This can cause several issues. For example, under continuous ingestion,
the two data-scans could produce different tuples.
To mitigate this issue, the projected fields of the different scans should be
consolidated, and the filter expressions of those scans should be normalized
and consolidated into a single disjunctive expression.
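The consolidation described above can be sketched roughly as follows. This is
only an illustration, not the AsterixDB implementation: the Scan class, the
normalize and consolidate helpers, and the representation of a filter as a
tuple of conjunct strings are all hypothetical. The sketch also assumes that a
scan without a filter forces the consolidated scan to be unfiltered.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence, Tuple

Conjunction = Tuple[str, ...]


@dataclass(frozen=True)
class Scan:
    """Hypothetical stand-in for one data-scan over a columnar dataset."""
    dataset: str
    fields: FrozenSet[str]
    # A filter is modeled as a conjunction of predicate strings.
    # None means the scan has no filter (it reads every tuple).
    filter: Optional[Conjunction] = None


def normalize(conjuncts: Conjunction) -> Conjunction:
    # Sort and deduplicate so that equivalent conjunctions compare equal.
    return tuple(sorted(set(conjuncts)))


def consolidate(
    scans: Sequence[Scan],
) -> Tuple[FrozenSet[str], Optional[List[Conjunction]]]:
    """Merge scans of one dataset into a single projection and filter.

    Returns the union of projected fields and a list of disjuncts
    (an OR over normalized conjunctions), or None for "no filter".
    """
    if len({s.dataset for s in scans}) != 1:
        raise ValueError("scans must all target the same dataset")

    # The single scan must project every field that any consumer needs.
    fields = frozenset().union(*(s.fields for s in scans))

    # Assumption: one unfiltered scan makes the whole disjunction true.
    if any(s.filter is None for s in scans):
        return fields, None

    # Identical normalized filters collapse into one disjunct.
    disjuncts = sorted({normalize(s.filter) for s in scans})
    return fields, disjuncts


a = Scan("Orders", frozenset({"id", "total"}), ("total > 100",))
b = Scan("Orders", frozenset({"id", "status"}),
         ("total > 100", "status = 'open'", "total > 100"))
fields, disjuncts = consolidate([a, b])
```

With a single consolidated scan, both consumers read from the same replicated
output, so concurrent ingestion cannot make them see different tuples. Each
consumer would presumably still need to re-apply its own filter and projection
on that output, since the disjunction admits tuples that only the other
consumer asked for.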
was:
When the rule ExtractCommonOperatorRule is fired, several physical operators
are introduced, and none of them carries its delivered physical properties.
As a result, the compiler cannot determine whether the PKs yielded from the
secondary index are sorted, which disables the batch point lookup.
Since non-batch point lookups are expensive in columnar datasets, we should
disable the range filter to allow the columnar unnest-map operator to be
replicated.
> Skip pushing columnar range-filter on secondary index with replicate candidate
> ------------------------------------------------------------------------------
>
> Key: ASTERIXDB-3332
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-3332
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: COMP - Compiler
> Affects Versions: 0.9.9
> Reporter: Wail Y. Alkowaileet
> Assignee: Wail Y. Alkowaileet
> Priority: Major
> Labels: triaged
> Fix For: 0.9.9