[
https://issues.apache.org/jira/browse/IMPALA-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe McDonnell updated IMPALA-14510:
-----------------------------------
Attachment: disable_ineffective_filters_tpcds.txt
> Runtime filters with low effectiveness should be disabled more aggressively
> ---------------------------------------------------------------------------
>
> Key: IMPALA-14510
> URL: https://issues.apache.org/jira/browse/IMPALA-14510
> Project: IMPALA
> Issue Type: Task
> Components: Backend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Major
> Attachments: disable_ineffective_filters_tpcds.txt
>
>
> There is an existing check for a runtime filter's effectiveness in
> HdfsScanner::CheckFiltersEffectiveness() which can disable runtime filters
> that are always true or don't meet the min_filter_reject_ratio (default value
> 0.1). This check runs every BATCHES_PER_FILTER_SELECTIVITY_CHECK (16) batches:
> {noformat}
> // Always add batch to the queue because it may contain data referenced by previously
> // appended batches.
> scan_node->AddMaterializedRowBatch(move(batch));
> RETURN_IF_ERROR(status);
> ++row_batches_produced_;
> if ((row_batches_produced_ & (BATCHES_PER_FILTER_SELECTIVITY_CHECK - 1)) == 0) {
>   CheckFiltersEffectiveness();
> }
> {noformat}
> If there are multiple runtime filters and one of them is very selective, the
> scanner may produce row batches only rarely. Because the check is driven by the
> number of row batches produced, an ineffective filter can be evaluated on many
> rows before being disabled (or may never be disabled). For example, from
> TPC-DS Q13:
> {noformat}
> - Rows processed: 47.46M (47458959)
> - Rows rejected: 0 (0)
> - Rows total: 47.46M (47458959)
> {noformat}
> We should try to disable ineffective filters more aggressively in this
> circumstance.
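
To make the starvation effect concrete, here is a rough back-of-the-envelope model. It is not Impala code: SimulateChecks, CheckCounts, the 1024-row batch size, and the row-based threshold are all assumptions made for illustration. It compares how often an effectiveness check would fire when it is triggered by row batches produced (as in the snippet above) versus by rows the filters have evaluated (one possible alternative, not necessarily the fix this issue will adopt).

```cpp
#include <cstdint>

// Hypothetical model, not Impala code. All constants below are assumptions
// except the value 16, which mirrors BATCHES_PER_FILTER_SELECTIVITY_CHECK.
constexpr int64_t kBatchSize = 1024;       // assumed rows per output row batch
constexpr int64_t kBatchesPerCheck = 16;   // batches between effectiveness checks
constexpr int64_t kRowsPerCheck = kBatchSize * kBatchesPerCheck;  // assumed row threshold

struct CheckCounts {
  int64_t by_output_batches = 0;  // checks triggered by row batches produced
  int64_t by_rows_processed = 0;  // checks triggered by rows the filters evaluated
};

// rows_in: rows evaluated by the runtime filters.
// survivors_per_million: rows that pass all filters, per million input rows.
CheckCounts SimulateChecks(int64_t rows_in, int64_t survivors_per_million) {
  CheckCounts counts;
  // Rows that survive every filter and are materialized into output batches.
  const int64_t rows_out = rows_in * survivors_per_million / 1000000;
  const int64_t batches_out = rows_out / kBatchSize;
  // Batch-driven trigger: one check per kBatchesPerCheck full output batches.
  counts.by_output_batches = batches_out / kBatchesPerCheck;
  // Row-driven trigger: one check per kRowsPerCheck rows evaluated,
  // regardless of how many rows survive.
  counts.by_rows_processed = rows_in / kRowsPerCheck;
  return counts;
}
```

Under these assumptions, plugging in the 47458959 rows from the Q13 profile with a companion filter that passes 100 rows per million gives zero batch-driven checks, so the ineffective filter would never be re-evaluated, while a row-driven trigger would have had thousands of opportunities to disable it.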