The second query (below), which uses GROUP BY, is not supported by Postgres; I will file a JIRA to block that query in Drill.
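For context, Postgres rejects this form because GROUP BY is evaluated before window functions, so col_int is no longer available as a bare column; the standard-compliant rewrite aggregates per group first and applies the window over the grouped result. Below is a minimal sketch of that rewrite, using an in-memory SQLite database as a stand-in for the view (the table and column names follow this report; the data values are invented for illustration):

```python
import sqlite3

# Stand-in for the vwOnParq view: two columns from the report, made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vwOnParq (col_int INTEGER, col_char_2 TEXT)")
conn.executemany("INSERT INTO vwOnParq VALUES (?, ?)",
                 [(5, "AZ"), (-19, "AZ"), (3, "CA"), (7, "CA")])

# GROUP BY runs before the window, so col_int must first be aggregated per
# group; the OVER() window then ranges over the grouped rows.
rows = conn.execute("""
    SELECT MIN(grp_min) OVER () AS overall_min
    FROM (SELECT MIN(col_int) AS grp_min
          FROM vwOnParq
          GROUP BY col_char_2)
""").fetchall()
print(rows)  # [(-19,), (-19,)] -- one row per group, each the overall minimum
```

Each output row then carries the overall minimum of the per-group minimums, which is presumably what the failing query intended.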
SELECT MIN(col_int) OVER() FROM vwOnParq GROUP BY col_char_2;

Output from Postgres:

postgres=# select min(col_int) over() from all_typs_tbl group by col_char_2;
ERROR:  column "all_typs_tbl.col_int" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select min(col_int) over() from all_typs_tbl group by col_ch...

Querying the original parquet file from which the view was created, we see an assertion error:

0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM `tblForView/0_0_0.parquet` GROUP BY col_char_2;
Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while converting MIN(`tblForView/0_0_0.parquet`.`col_int`)

[Error Id: e8ed279d-aa8c-4db1-9906-5dd7fdecaac2 on centos-02.qa.lab:31010] (state=,code=0)

On Tue, Jun 23, 2015 at 5:31 PM, Abdel Hakim Deneche <[email protected]> wrote:

> What happens if you run the queries on the original parquet files and not
> the views?
>
> On Tue, Jun 23, 2015 at 5:28 PM, Khurram Faraaz <[email protected]>
> wrote:
>
> > A windowing query over a view returns wrong results both with and
> > without a GROUP BY clause. Please let me know if this is a planning bug?
> > Postgres does not support the query where we use a GROUP BY.
> >
> > DDL used for view creation:
> >
> > create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
> > col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
> > col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
> > col_dbl, col_tm from `tblForView/0_0_0.parquet`;
> >
> > The two queries are:
> >
> > 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;
> > +---------+
> > | EXPR$0  |
> > +---------+
> > | -19     |
> > | -19     |
> > | -19     |
> > ... (same value in every row) ...
> > | -19     |
> > +---------+
> > 30 rows selected (0.26 seconds)
> >
> > Explain plan for the above query:
> >
> > 00-00 Screen
> > 00-01   Project(EXPR$0=[$0])
> > 00-02     Project($0=[$9])
> > 00-03       Window(window#0=[window(partition {} order by [] range
> >             between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
> > 00-04         Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
> >             col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
> >             col_dbl=[$8], col_tm=[$5])
> > 00-05           Scan(groupscan=[ParquetGroupScan
> >             [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> >             selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> >             columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
> >             `col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])
> >
> > 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
> > group by col_char_2;
> > +---------+
> > | EXPR$0  |
> > +---------+
> > | AZ      |
> > | AZ      |
> > | AZ      |
> > ... (same value in every row) ...
> > | AZ      |
> > +---------+
> > 18 rows selected (0.27 seconds)
> >
> > Explain plan for the above query that uses group by:
> >
> > 00-00 Screen
> > 00-01   Project(EXPR$0=[$0])
> > 00-02     Project($0=[$2])
> > 00-03       Window(window#0=[window(partition {} order by [] range
> >             between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
> > 00-04         HashAgg(group=[{0}], agg#0=[MIN($1)])
> > 00-05           Scan(groupscan=[ParquetGroupScan
> >             [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> >             selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> >             columns=[`col_char_2`, `col_int`]]])
> >
> > Thanks,
> > Khurram
>
> --
> Abdelhakim Deneche
> Software Engineer
> <http://www.mapr.com/>
