[ 
https://issues.apache.org/jira/browse/IMPALA-14510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-14510:
-----------------------------------
    Attachment: disable_ineffective_filters_tpcds.txt

> Runtime filters with low effectiveness should be disabled more aggressively
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-14510
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14510
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Priority: Major
>         Attachments: disable_ineffective_filters_tpcds.txt
>
>
> There is an existing check for a runtime filter's effectiveness in 
> HdfsScanner::CheckFiltersEffectiveness() which can disable runtime filters 
> that are always true or don't meet the min_filter_reject_ratio (default value 
> 0.1). This check runs every BATCHES_PER_FILTER_SELECTIVITY_CHECK (16) batches:
> {noformat}
>     // Always add batch to the queue because it may contain data referenced by previously
>     // appended batches.
>     scan_node->AddMaterializedRowBatch(move(batch));
>     RETURN_IF_ERROR(status);
>     ++row_batches_produced_;
>     if ((row_batches_produced_ & (BATCHES_PER_FILTER_SELECTIVITY_CHECK - 1)) == 0) {
>       CheckFiltersEffectiveness();
>     }
> {noformat}
> If there are multiple runtime filters and one of them is very selective, the 
> scanner may not produce row batches very often. This means that an ineffective 
> filter can be evaluated on many rows before being disabled (or may never be 
> disabled). For example, from TPC-DS Q13:
> {noformat}
>  - Rows processed: 47.46M (47458959)
>  - Rows rejected: 0 (0)
>  - Rows total: 47.46M (47458959)
> {noformat}
> We should try to disable ineffective filters more aggressively in this 
> circumstance.
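
To make the failure mode above concrete, here is a minimal standalone sketch. It is not Impala code. It models one filter that never rejects a row and one very selective filter, and compares a check triggered by batches produced (as in the snippet above) with a check triggered by rows processed. The row-based trigger, the 1-in-1000 selectivity, the 1024-row batch size, and every name except the two constants taken from the issue text are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Values taken from the issue text.
constexpr int64_t kBatchesPerCheck = 16;  // BATCHES_PER_FILTER_SELECTIVITY_CHECK
constexpr double kMinRejectRatio = 0.1;   // min_filter_reject_ratio default

// Hypothetical values, chosen only for this illustration.
constexpr int64_t kBatchSize = 1024;              // rows per output batch
constexpr int64_t kRowsPerCheck = 16 * 1024;      // row-based check interval
constexpr int64_t kSelectiveKeepEvery = 1000;     // selective filter keeps 1 in 1000

enum class Trigger { kBatches, kRows };

struct FilterStats {
  int64_t considered = 0;
  int64_t rejected = 0;
  bool enabled = true;
};

// Disables the filter if its reject ratio is below the threshold.
void CheckEffectiveness(FilterStats& f) {
  assert(f.rejected <= f.considered);
  if (!f.enabled || f.considered == 0) return;
  const double ratio = static_cast<double>(f.rejected) / f.considered;
  if (ratio < kMinRejectRatio) f.enabled = false;
}

// Returns how many rows the ineffective filter was evaluated on before it was
// disabled (or total_rows if the check never fired).
int64_t RowsEvaluatedByIneffectiveFilter(Trigger trigger, int64_t total_rows) {
  FilterStats ineffective;  // this filter never rejects anything
  int64_t rows_in_batch = 0;
  int64_t batches_produced = 0;
  for (int64_t row = 0; row < total_rows; ++row) {
    if (ineffective.enabled) ++ineffective.considered;

    // Hypothetical alternative: check after a fixed number of input rows.
    if (trigger == Trigger::kRows && (row + 1) % kRowsPerCheck == 0) {
      CheckEffectiveness(ineffective);
    }

    // The selective filter drops most rows, so output batches fill slowly.
    if (row % kSelectiveKeepEvery != 0) continue;
    if (++rows_in_batch == kBatchSize) {
      rows_in_batch = 0;
      ++batches_produced;
      // Same power-of-two mask test as in the snippet quoted above.
      if (trigger == Trigger::kBatches &&
          (batches_produced & (kBatchesPerCheck - 1)) == 0) {
        CheckEffectiveness(ineffective);
      }
    }
  }
  return ineffective.considered;
}
```

Under these assumed numbers, calling `RowsEvaluatedByIneffectiveFilter(Trigger::kBatches, 1000000)` returns 1000000, because fewer than 1024 rows survive the selective filter and no batch is ever completed, so the check never runs. With `Trigger::kRows` the same input returns 16384, since the check fires after the first `kRowsPerCheck` input rows. This is only one possible way to trigger the check earlier; the issue does not specify which approach should be taken.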



