Windowing query over a view returns wrong results when used with and
without a group by clause. Please let me know if this is a planning bug ?
Postgres does not support the query where we use a group by.
DDL used for view creation was,
create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
col_dbl, col_tm from `tblForView/0_0_0.parquet`;
The two queries are,
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;
*+---------+*
*| **EXPR$0 ** |*
*+---------+*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*| *-19 * |*
*+---------+*
30 rows selected (0.26 seconds)
Explain plan for the above query
*| *00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project($0=[$9])
00-03 Window(window#0=[window(partition {} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
00-04 Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
col_dbl=[$8], col_tm=[$5])
00-05 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
`col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
group by col_char_2;
*+---------+*
*| **EXPR$0 ** |*
*+---------+*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*| *AZ * |*
*+---------+*
18 rows selected (0.27 seconds)
Explain plan for the above query that uses group by
*| *00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project($0=[$2])
00-03 Window(window#0=[window(partition {} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
00-04 HashAgg(group=[{0}], agg#0=[MIN($1)])
00-05 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
columns=[`col_char_2`, `col_int`]]])
Thanks,
Khurram