[
https://issues.apache.org/jira/browse/DRILL-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Khurram Faraaz closed DRILL-3680.
---------------------------------
Verified fix on master commit id: b525692e
added test
{code}
0: jdbc:drill:schema=dfs.tmp> select c1 , c2 , lead(c2) OVER ( PARTITION BY c2
ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER
BY c1) FROM `tblWnulls.parquet`);
+-------------+-------+----------+
| c1 | c2 | lead_c2 |
+-------------+-------+----------+
| 0 | a | a |
| 1 | a | a |
| 5 | a | a |
| 10 | a | a |
| 11 | a | a |
| 14 | a | a |
| 11111 | a | null |
| 2 | b | b |
| 9 | b | b |
| 13 | b | b |
| 17 | b | null |
| 4 | c | c |
| 6 | c | c |
| 8 | c | c |
| 12 | c | c |
| 13 | c | c |
| 13 | c | c |
| null | c | null |
| 10 | d | d |
| 11 | d | d |
| 2147483647 | d | d |
| 2147483647 | d | d |
| null | d | d |
| null | d | null |
| -1 | e | e |
| 15 | e | null |
| 19 | null | null |
| 65536 | null | null |
| 1000000 | null | null |
| null | null | null |
+-------------+-------+----------+
30 rows selected (0.579 seconds)
{code}
> window function query returns Incorrect results
> ------------------------------------------------
>
> Key: DRILL-3680
> URL: https://issues.apache.org/jira/browse/DRILL-3680
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.2.0
> Environment: private-branch
> https://github.com/adeneche/incubator-drill/tree/new-window-funcs
> Reporter: Khurram Faraaz
> Assignee: Jinfeng Ni
> Priority: Critical
> Labels: window_function
> Fix For: 1.2.0
>
> Attachments:
> 0001-DRILL-3680-Fix-incorrect-query-result-or-IOBE-when-t.patch,
> 0001-DRILL-3680-Fix-incorrect-query-result-or-IOBE-when-t.patch.2,
> tblWnulls.parquet
>
>
> Query plan from Drill for the query that returns wrong results
> {code}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1 , c2 , lead(c2) OVER
> ( PARTITION BY c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3)
> over(PARTITION BY c2 ORDER BY c1) FROM `tblWnulls.parquet`);
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01 Project(c1=[$0], c2=[$1], lead_c2=[$2])
> 00-02 Project(c1=[$0], c2=[$1], lead_c2=[$2])
> 00-03 Project(c1=[$0], c2=[$1], $2=[$3])
> 00-04 Window(window#0=[window(partition {1} order by [0] range
> between UNBOUNDED PRECEDING and CURRENT ROW aggs [LEAD($1)])])
> 00-05 Window(window#0=[window(partition {1} order by [0] range
> between UNBOUNDED PRECEDING and CURRENT ROW aggs [NTILE($2)])])
> 00-06 SelectionVectorRemover
> 00-07 Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
> 00-08 Project(c1=[$1], c2=[$0])
> 00-09 Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls.parquet]],
> selectionRoot=maprfs:/tmp/tblWnulls.parquet, numFiles=1, columns=[`c1`,
> `c2`]]])
> {code}
> Results returned by Drill.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select c1 , c2 , lead(c2) OVER ( PARTITION BY
> c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2
> ORDER BY c1) FROM `tblWnulls.parquet`);
> +-------------+-------+----------+
> | c1 | c2 | lead_c2 |
> +-------------+-------+----------+
> | 0 | a | null |
> | 1 | a | null |
> | 5 | a | null |
> | 10 | a | null |
> | 11 | a | null |
> | 14 | a | null |
> | 11111 | a | null |
> | 2 | b | null |
> | 9 | b | null |
> | 13 | b | null |
> | 17 | b | null |
> | 4 | c | null |
> | 6 | c | null |
> | 8 | c | null |
> | 12 | c | null |
> | 13 | c | null |
> | 13 | c | null |
> | null | c | null |
> | 10 | d | null |
> | 11 | d | null |
> | 2147483647 | d | null |
> | 2147483647 | d | null |
> | null | d | null |
> | null | d | null |
> | -1 | e | null |
> | 15 | e | null |
> | 19 | null | null |
> | 65536 | null | null |
> | 1000000 | null | null |
> | null | null | null |
> +-------------+-------+----------+
> 30 rows selected (0.339 seconds)
> {code}
> Results returned by Postgres
> {code}
> postgres=# select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 ORDER BY c1)
> lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER BY c1) FROM
> t222) sub_query;
> c1 | c2 | lead_c2
> ------------+----+---------
> 0 | a | a
> 1 | a | a
> 5 | a | a
> 10 | a | a
> 11 | a | a
> 14 | a | a
> 11111 | a |
> 2 | b | b
> 9 | b | b
> 13 | b | b
> 17 | b |
> 4 | c | c
> 6 | c | c
> 8 | c | c
> 12 | c | c
> 13 | c | c
> 13 | c | c
> | c |
> 10 | d | d
> 11 | d | d
> 2147483647 | d | d
> 2147483647 | d | d
> | d | d
> | d |
> -1 | e | e
> 15 | e |
> 19 | |
> 65536 | |
> 1000000 | |
> | |
> (30 rows)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)