[ https://issues.apache.org/jira/browse/DRILL-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642849#comment-14642849 ]
Deneche A. Hakim commented on DRILL-3412:
-----------------------------------------
Interestingly, adding a constant to one of the window functions causes the
projections to be pushed below the Window operator:
Query without constant:
{noformat}
0: jdbc:drill:zk=local> explain plan for SELECT RANK() OVER (PARTITION BY
ss.ss_store_sk ORDER BY ss.ss_store_sk) FROM store_sales ss LIMIT 20;
00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 SelectionVectorRemover
00-03 Limit(fetch=[20])
00-04 UnionExchange
01-01 Project(w0$o0=[$2])
01-02 Window(window#0=[window(partition {1} order by [1] range
between UNBOUNDED PRECEDING and CURRENT ROW aggs [RANK()])])
01-03 SelectionVectorRemover
01-04 Sort(sort0=[$1], sort1=[$1], dir0=[ASC], dir1=[ASC])
01-05 Project(T3¦¦*=[$0], ss_store_sk=[$1])
01-06 HashToRandomExchange(dist0=[[$1]])
02-01 UnorderedMuxExchange
03-01 Project(T3¦¦*=[$0], ss_store_sk=[$1],
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1))])
03-02 Project(T3¦¦*=[$0], ss_store_sk=[$1])
03-03 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales]],
selectionRoot=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales,
numFiles=1, columns=[`*`]]])
{noformat}
Query with a constant in the ORDER BY clause (the query is semantically
unchanged, because the original ORDER BY used the partition column):
{noformat}
0: jdbc:drill:zk=local> explain plan for SELECT RANK() OVER (PARTITION BY
ss.ss_store_sk ORDER BY 1) FROM store_sales ss LIMIT 20;
00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 SelectionVectorRemover
00-03 Limit(fetch=[20])
00-04 UnionExchange
01-01 Project($0=[$1])
01-02 Window(window#0=[window(partition {0} order by [] range
between UNBOUNDED PRECEDING and CURRENT ROW aggs [RANK()])])
01-03 SelectionVectorRemover
01-04 Sort(sort0=[$0], dir0=[ASC])
01-05 Project(ss_store_sk=[$0])
01-06 HashToRandomExchange(dist0=[[$0]])
02-01 UnorderedMuxExchange
03-01 Project(ss_store_sk=[$0],
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales]],
selectionRoot=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales,
numFiles=1, columns=[`ss_store_sk`]]])
{noformat}
Query with a constant in a different window function, COUNT(1):
{noformat}
0: jdbc:drill:zk=local> explain plan for SELECT COUNT(1) OVER(PARTITION BY
ss.ss_store_sk ORDER BY ss.ss_store_sk), RANK() OVER (PARTITION BY
ss.ss_store_sk ORDER BY ss.ss_store_sk) FROM store_sales ss LIMIT 20;
00-00 Screen
00-01 Project(EXPR$0=[$0], EXPR$1=[$1])
00-02 SelectionVectorRemover
00-03 Limit(fetch=[20])
00-04 UnionExchange
01-01 Project($0=[$1], $1=[$2])
01-02 Window(window#0=[window(partition {0} order by [0] range
between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($1), RANK()])])
01-03 SelectionVectorRemover
01-04 Sort(sort0=[$0], sort1=[$0], dir0=[ASC], dir1=[ASC])
01-05 Project(ss_store_sk=[$0])
01-06 HashToRandomExchange(dist0=[[$0]])
02-01 UnorderedMuxExchange
03-01 Project(ss_store_sk=[$0],
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales]],
selectionRoot=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales,
numFiles=1, columns=[`ss_store_sk`]]])
{noformat}
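In the three plans above, the quickest way to tell whether the projection was pushed down is the columns=[...] attribute of the Scan: it lists `*` when the whole row is read and only `ss_store_sk` when pushdown worked. The following is a minimal Python sketch of that check, not part of Drill; the helper names scan_columns and projection_pushed_down are hypothetical, and it assumes the plan text has the shape printed by "explain plan for" in this comment.

```python
import re
from typing import List


def scan_columns(plan_text: str) -> List[str]:
    """Collect the backtick-quoted names from every columns=[...] attribute."""
    columns: List[str] = []
    # Assumption: column names contain neither ']' nor a backtick.
    for match in re.finditer(r"columns=\[([^\]]*)\]", plan_text):
        columns.extend(re.findall(r"`([^`]*)`", match.group(1)))
    return columns


def projection_pushed_down(plan_text: str) -> bool:
    """True when at least one Scan is found and none of them reads the star column."""
    cols = scan_columns(plan_text)
    return bool(cols) and "*" not in cols


# Scan fragments copied from the plans in this comment.
plan_without_constant = (
    "selectionRoot=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales,\n"
    "numFiles=1, columns=[`*`]]])"
)
plan_with_constant = (
    "selectionRoot=file:/Users/hakim/MapR/data/tpcds100/parquet/store_sales,\n"
    "numFiles=1, columns=[`ss_store_sk`]]])"
)

print(projection_pushed_down(plan_without_constant))  # False
print(projection_pushed_down(plan_with_constant))     # True
```

A check like this only inspects the Scan; it does not verify that intermediate Project operators also dropped the star column.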
> Projections are not getting push down below Window operator
> -----------------------------------------------------------
>
> Key: DRILL-3412
> URL: https://issues.apache.org/jira/browse/DRILL-3412
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Aman Sinha
> Assignee: Jinfeng Ni
> Priority: Blocker
> Labels: window_function
> Fix For: 1.2.0
>
>
> The plan below shows that the 'star' column is being produced by the Scan and
> the subsequent Project. This indicates that projection pushdown is not working
> as desired when a window function is present. The query produces correct results.
> {code}
> explain plan for select min(n_nationkey) over (partition by n_regionkey) from
> cp.`tpch/nation.parquet` ;
> 00-00 Screen
> 00-01 Project(EXPR$0=[$0])
> 00-02 Project(w0$o0=[$3])
> 00-03 Window(window#0=[window(partition {2} order by [] range
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($1)])])
> 00-04 SelectionVectorRemover
> 00-05 Sort(sort0=[$2], dir0=[ASC])
> 00-06 Project(T1¦¦*=[$0], n_nationkey=[$1], n_regionkey=[$2])
> 00-07 Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]],
> selectionRoot=/tpch/nation.parquet, numFiles=1, columns=[`*`]]])
> {code}