[ https://issues.apache.org/jira/browse/DRILL-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Victoria Markman updated DRILL-2204:
------------------------------------
Component/s: (was: Execution - Data Types)
Execution - Flow
> DISTINCT statement over UNION ALL subquery asserts during execution with
> streaming aggregation
> ----------------------------------------------------------------------------------------------
>
> Key: DRILL-2204
> URL: https://issues.apache.org/jira/browse/DRILL-2204
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 0.8.0
> Reporter: Victoria Markman
> Assignee: Daniel Barclay (Drill/MapR)
>
> {code}
> 0: jdbc:drill:schema=dfs> select distinct sq.x1, sq.x2, sq.x3 from ( select
> a1, b1, c1 from t1 union all select a2, b2, c2 from t2 ) as sq(x1,x2,x3);
> +------------+------------+------------+
> | x1 | x2 | x3 |
> +------------+------------+------------+
> Query failed: RemoteRpcException: Failure while running fragment., Failure
> while reading vector. Expected vector class of
> org.apache.drill.exec.vector.NullableVarCharVector but was holding vector
> class org.apache.drill.exec.vector.NullableIntVector. [
> dd2fedd7-bbee-40a4-8a26-9ca86fa774f6 on atsqa4-134.qa.lab:31010 ]
> [ dd2fedd7-bbee-40a4-8a26-9ca86fa774f6 on atsqa4-134.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing
> query.
> at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> at sqlline.SqlLine.print(SqlLine.java:1809)
> at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> at sqlline.SqlLine.dispatch(SqlLine.java:889)
> at sqlline.SqlLine.begin(SqlLine.java:763)
> at sqlline.SqlLine.start(SqlLine.java:498)
> at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> Plan:
> {code}
> 00-01      Project(x1=[$0], x2=[$1], x3=[$2])
> 00-02        StreamAgg(group=[{0, 1, 2}])
> 00-03          Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
> 00-04            Project(x1=[$0], x2=[$1], x3=[$2])
> 00-05              UnionAll(all=[true])
> 00-07                Project(a1=[$2], b1=[$1], c1=[$0])
> 00-09                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t1]], selectionRoot=/aggregation/sanity/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> 00-06                Project(a2=[$1], b2=[$0], c2=[$2])
> 00-08                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t2]], selectionRoot=/aggregation/sanity/t2, numFiles=1, columns=[`a2`, `b2`, `c2`]]])
> {code}
> The query works if the columns in both branches of the union have the same
> names.
> {code}
> 0: jdbc:drill:schema=dfs> select distinct sq.x1, sq.x2, sq.x3 from ( select
> a1, b1, c1 from t1 union all select a1, b1, c1 from t4 ) as sq(x1,x2,x3);
> +------------+------------+------------+
> | x1 | x2 | x3 |
> +------------+------------+------------+
> | 1 | aaaaa | 2015-01-01 |
> | 2 | bbbbb | 2015-01-02 |
> | 3 | ccccc | 2015-01-03 |
> | 4 | null | 2015-01-04 |
> | 5 | eeeee | 2015-01-05 |
> | 6 | fffff | 2015-01-06 |
> | 7 | ggggg | 2015-01-07 |
> | 9 | iiiii | null |
> | 10 | jjjjj | 2015-01-10 |
> | null | hhhhh | 2015-01-08 |
> +------------+------------+------------+
> 10 rows selected (0.131 seconds)
> {code}
> This bug may be caused by DRILL-2203, but I'm filing it separately because
> the failure mode is different, and it will need to be verified after
> DRILL-2203 is fixed. The tables for the query are attached to DRILL-2203.