Victoria Markman created DRILL-3574:
---------------------------------------
Summary: Wrong result with SUM window function in the query with
multiple window definitions
Key: DRILL-3574
URL: https://issues.apache.org/jira/browse/DRILL-3574
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Environment: private-branch-with-multiple-partitions-enabled
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Critical
Incorrect result:
{code}
0: jdbc:drill:drillbit=localhost> select * from t1;
+-------+--------+-------------+
| a1 | b1 | c1 |
+-------+--------+-------------+
| 1 | aaaaa | 2015-01-01 |
| 2 | bbbbb | 2015-01-02 |
| 3 | ccccc | 2015-01-03 |
| 4 | null | 2015-01-04 |
| 5 | eeeee | 2015-01-05 |
| 6 | fffff | 2015-01-06 |
| 7 | ggggg | 2015-01-07 |
| null | hhhhh | 2015-01-08 |
| 9 | iiiii | null |
| 10 | jjjjj | 2015-01-10 |
+-------+--------+-------------+
10 rows selected (0.087 seconds)
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . . > a1,
. . . . . . . . . . . . . . . . > sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . . > sum(a1) over()
. . . . . . . . . . . . . . . . > from
. . . . . . . . . . . . . . . . > t1
. . . . . . . . . . . . . . . . > order by
. . . . . . . . . . . . . . . . > a1;
+-------+---------+---------+
| a1 | EXPR$1 | EXPR$2 |
+-------+---------+---------+
| 1 | 1 | 6 |
| 2 | 2 | 6 |
| 3 | 3 | 6 |
| 4 | 4 | 19 |
| 5 | 5 | 22 |
| 6 | 6 | 19 |
| 7 | 7 | 22 |
| 9 | 9 | 19 |
| 10 | 10 | 22 |
| null | null | 6 |
+-------+---------+---------+
10 rows selected (0.165 seconds)
0: jdbc:drill:drillbit=localhost> explain plan for select
. . . . . . . . . . . . . . . . > a1,
. . . . . . . . . . . . . . . . > sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . . > sum(a1) over()
. . . . . . . . . . . . . . . . > from
. . . . . . . . . . . . . . . . > t1
. . . . . . . . . . . . . . . . > order by
. . . . . . . . . . . . . . . . > a1;
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 ProjectAllowDup(a1=[$0], EXPR$1=[$1], EXPR$2=[$2])
00-02 SingleMergeExchange(sort0=[0 ASC])
01-01 SelectionVectorRemover
01-02 Sort(sort0=[$0], dir0=[ASC])
01-03 Project(a1=[$0], w0$o0=[$1], w1$o0=[$2])
01-04 HashToRandomExchange(dist0=[[$0]])
02-01 UnorderedMuxExchange
03-01 Project(a1=[$0], w0$o0=[$1], w1$o0=[$2],
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02 Project(a1=[$1], w0$o0=[$4], w1$o0=[$5])
03-03 Window(window#0=[window(partition {} order by []
range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
03-04 Window(window#0=[window(partition {2, 3} order
by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs
[SUM($1)])])
03-05 SelectionVectorRemover
03-06 Sort(sort0=[$2], sort1=[$3], dir0=[ASC],
dir1=[ASC])
03-07 Project(T154¦¦*=[$0], a1=[$1], b1=[$2],
c1=[$3])
03-08 HashToRandomExchange(dist0=[[$2]],
dist1=[[$3]])
04-01 UnorderedMuxExchange
05-01 Project(T154¦¦*=[$0], a1=[$1],
b1=[$2], c1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($3,
hash64AsDouble($2)))])
05-02 Project(T154¦¦*=[$0], a1=[$1],
b1=[$2], c1=[$3])
05-03 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=file:/Users/vmarkman/drill/testdata/subqueries/t1]],
selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1, numFiles=1,
columns=[`*`]]])
{code}
Correct result:
{code}
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . . > a1,
. . . . . . . . . . . . . . . . > sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . . > sum(a1) over()
. . . . . . . . . . . . . . . . > from
. . . . . . . . . . . . . . . . > t1
. . . . . . . . . . . . . . . . > order by
. . . . . . . . . . . . . . . . > a1;
+-------+---------+---------+
| a1 | EXPR$1 | EXPR$2 |
+-------+---------+---------+
| 1 | 1 | 47 |
| 2 | 2 | 47 |
| 3 | 3 | 47 |
| 4 | 4 | 47 |
| 5 | 5 | 47 |
| 6 | 6 | 47 |
| 7 | 7 | 47 |
| 9 | 9 | 47 |
| 10 | 10 | 47 |
| null | null | 47 |
+-------+---------+---------+
10 rows selected (0.117 seconds)
0: jdbc:drill:drillbit=localhost> explain plan for select
. . . . . . . . . . . . . . . . > a1,
. . . . . . . . . . . . . . . . > sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . . > sum(a1) over()
. . . . . . . . . . . . . . . . > from
. . . . . . . . . . . . . . . . > t1
. . . . . . . . . . . . . . . . > order by
. . . . . . . . . . . . . . . . > a1;
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 ProjectAllowDup(a1=[$0], EXPR$1=[$1], EXPR$2=[$2])
00-02 SelectionVectorRemover
00-03 Sort(sort0=[$0], dir0=[ASC])
00-04 Project(a1=[$1], w0$o0=[$4], w1$o0=[$5])
00-05 Window(window#0=[window(partition {} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
00-06 Window(window#0=[window(partition {2, 3} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
00-07 SelectionVectorRemover
00-08 Sort(sort0=[$2], sort1=[$3], dir0=[ASC], dir1=[ASC])
00-09 Project(T157¦¦*=[$0], a1=[$1], b1=[$2], c1=[$3])
00-10 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath
[path=file:/Users/vmarkman/drill/testdata/subqueries/t1]],
selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1, numFiles=1,
columns=[`*`]]])
{code}
Reproduction:
{code}
alter session set `planner.slice_target` = 1;
select
a1,
sum(a1) over(partition by b1, c1),
sum(a1) over()
from
t1
order by
a1;
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)