[
https://issues.apache.org/jira/browse/IMPALA-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798575#comment-16798575
]
Todd Lipcon commented on IMPALA-3430:
-------------------------------------
One particularly common case (and maybe easiest to implement) is an
uncorrelated subquery. For example, a query like:
{code}
select count(*) from t where c < (select avg(c) from t);
{code}
This gets planned as a nested loop join against a one-row table (materialized
from the subquery). Because the build side is a single value, it's trivial to
take the non-equijoin predicate (c < that value) and propagate it to the scan
as a runtime "max" filter, i.e. an upper bound on c.
(The fix for this special case may fall out of a more general implementation,
but if the general implementation is tough, it might be worth attacking this
one on its own because it's relatively common.)
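
To make the idea concrete, here is a rough Python sketch of what the "max"
filter buys on the scan side. It is not Impala code: RowGroup and
count_below_avg are hypothetical names, the row groups are assumed to be
non-empty, and NULL handling is ignored. The subquery result becomes an upper
bound, and any row group whose minimum for c is already at or above that bound
is skipped without reading its rows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group holding column c."""
    values: List[float]

    @property
    def min_c(self) -> float:
        # In a real file this would come from column statistics,
        # not from scanning the values.
        return min(self.values)


def count_below_avg(row_groups: List[RowGroup]) -> Tuple[int, int]:
    """Evaluate: select count(*) from t where c < (select avg(c) from t).

    Returns (matching row count, number of row groups skipped).
    """
    # "Build side": the uncorrelated subquery yields exactly one value.
    total = sum(sum(rg.values) for rg in row_groups)
    n = sum(len(rg.values) for rg in row_groups)
    upper_bound = total / n  # this is the runtime "max" filter

    count = 0
    skipped = 0
    for rg in row_groups:
        # If even the smallest value is >= the bound, no row can satisfy
        # c < upper_bound, so the whole row group can be skipped.
        if rg.min_c >= upper_bound:
            skipped += 1
            continue
        count += sum(1 for v in rg.values if v < upper_bound)
    return count, skipped


groups = [RowGroup([1, 2, 3]), RowGroup([10, 20, 30]), RowGroup([4, 5, 6])]
print(count_below_avg(groups))
```

With the sample data the average is 9, so the middle row group (minimum 10) is
skipped entirely, and the result matches a full scan.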
> Runtime filter : Extend runtime filter to support Min/Max values for HDFS
> scans
> -------------------------------------------------------------------------------
>
> Key: IMPALA-3430
> URL: https://issues.apache.org/jira/browse/IMPALA-3430
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Affects Versions: Impala 2.6.0
> Reporter: Mostafa Mokhtar
> Priority: Minor
> Labels: performance, runtime-filters
>
> Annotating runtime filters with min/max values can help with:
> * Inequality joins
> * Pushing more efficient filters to the scan
> * Skipping reads of Parquet blocks, reducing I/O