[
https://issues.apache.org/jira/browse/DRILL-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinfeng Ni resolved DRILL-3716.
-------------------------------
Resolution: Fixed
> Drill should push filter past aggregate in order to improve query performance.
> ------------------------------------------------------------------------------
>
> Key: DRILL-3716
> URL: https://issues.apache.org/jira/browse/DRILL-3716
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
>
> For a query that has a filter on top of an aggregation, Drill currently does
> not push the filter past the aggregation. As a result, we may miss some
> optimization opportunities. For instance, such a filter could potentially be
> pushed into the scan if it qualifies for partition pruning.
> Consider the following query:
> {code}
> select n_regionkey, cnt from
>   (select n_regionkey, count(*) cnt
>    from (select n.n_nationkey, n.n_regionkey, n.n_name
>          from cp.`tpch/nation.parquet` n
>          left join
>          cp.`tpch/region.parquet` r
>          on n.n_regionkey = r.r_regionkey)
>    group by n_regionkey)
> where n_regionkey = 2;
> {code}
> The current plan shows a filter (00-04) on top of the aggregation (00-05). A
> better plan would have the filter pushed past the aggregation.
> The root cause of this problem is that Drill's ruleset does not include
> FilterAggregateTransposeRule from the Calcite library.
> {code}
> 00-01 Project(n_regionkey=[$0], cnt=[$1])
> 00-02   Project(n_regionkey=[$0], cnt=[$1])
> 00-03     SelectionVectorRemover
> 00-04       Filter(condition=[=($0, 2)])
> 00-05         StreamAgg(group=[{0}], cnt=[COUNT()])
> 00-06           Project(n_regionkey=[$0])
> 00-07             MergeJoin(condition=[=($0, $1)], joinType=[left])
> 00-09               SelectionVectorRemover
> 00-11                 Sort(sort0=[$0], dir0=[ASC])
> 00-13                   Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, columns=[`n_regionkey`]]])
> 00-08               SelectionVectorRemover
> 00-10                 Sort(sort0=[$0], dir0=[ASC])
> 00-12                   Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/region.parquet]], selectionRoot=classpath:/tpch/region.parquet, numFiles=1, columns=[`r_regionkey`]]])
> {code}
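> For illustration, the effect of the pushdown is roughly what one would get by
> moving the predicate inside the aggregating subquery by hand. This is only a
> sketch of the intended equivalence, not a plan or rewrite produced by Drill.
> It relies on the predicate referencing only the grouping column n_regionkey,
> so filtering before or after the group-by keeps the same group and count.
> {code}
> -- Hand-written equivalent (illustrative assumption, not Drill output):
> -- the filter on the grouping column is applied before the aggregation.
> select n_regionkey, cnt from
>   (select n_regionkey, count(*) cnt
>    from (select n.n_nationkey, n.n_regionkey, n.n_name
>          from cp.`tpch/nation.parquet` n
>          left join
>          cp.`tpch/region.parquet` r
>          on n.n_regionkey = r.r_regionkey)
>    where n_regionkey = 2
>    group by n_regionkey);
> {code}
> Once the filter sits below the aggregation, the planner can consider pushing
> it further down, for example toward the scan when partition pruning applies.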