Victoria Markman created DRILL-2204:
---------------------------------------

             Summary: DISTINCT statement over UNION ALL subquery asserts during execution with streaming aggregation
                 Key: DRILL-2204
                 URL: https://issues.apache.org/jira/browse/DRILL-2204
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Data Types
    Affects Versions: 0.8.0
            Reporter: Victoria Markman
            Assignee: Daniel Barclay (Drill/MapR)


{code}
0: jdbc:drill:schema=dfs> select distinct sq.x1, sq.x2, sq.x3 from ( select a1, 
b1, c1 from t1 union all select a2, b2, c2 from t2 ) as sq(x1,x2,x3);
+------------+------------+------------+
|     x1     |     x2     |     x3     |
+------------+------------+------------+
Query failed: RemoteRpcException: Failure while running fragment., Failure 
while reading vector.  Expected vector class of 
org.apache.drill.exec.vector.NullableVarCharVector but was holding vector class 
org.apache.drill.exec.vector.NullableIntVector. [ 
dd2fedd7-bbee-40a4-8a26-9ca86fa774f6 on atsqa4-134.qa.lab:31010 ]
[ dd2fedd7-bbee-40a4-8a26-9ca86fa774f6 on atsqa4-134.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
        at sqlline.SqlLine.print(SqlLine.java:1809)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
        at sqlline.SqlLine.dispatch(SqlLine.java:889)
        at sqlline.SqlLine.begin(SqlLine.java:763)
        at sqlline.SqlLine.start(SqlLine.java:498)
        at sqlline.SqlLine.main(SqlLine.java:460)
{code}

Plan:
{code}
00-01      Project(x1=[$0], x2=[$1], x3=[$2])
00-02        StreamAgg(group=[{0, 1, 2}])
00-03          Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], 
dir2=[ASC])
00-04            Project(x1=[$0], x2=[$1], x3=[$2])
00-05              UnionAll(all=[true])
00-07                Project(a1=[$2], b1=[$1], c1=[$0])
00-09                  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t1]], 
selectionRoot=/aggregation/sanity/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
00-06                Project(a2=[$1], b2=[$0], c2=[$2])
00-08                  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t2]], 
selectionRoot=/aggregation/sanity/t2, numFiles=1, columns=[`a2`, `b2`, `c2`]]])
{code}

The query works if the columns in both branches of the UNION ALL have the same names:
{code}
0: jdbc:drill:schema=dfs> select distinct sq.x1, sq.x2, sq.x3 from ( select a1, 
b1, c1 from t1 union all select a1, b1, c1 from t4 ) as sq(x1,x2,x3);
+------------+------------+------------+
|     x1     |     x2     |     x3     |
+------------+------------+------------+
| 1          | aaaaa      | 2015-01-01 |
| 2          | bbbbb      | 2015-01-02 |
| 3          | ccccc      | 2015-01-03 |
| 4          | null       | 2015-01-04 |
| 5          | eeeee      | 2015-01-05 |
| 6          | fffff      | 2015-01-06 |
| 7          | ggggg      | 2015-01-07 |
| 9          | iiiii      | null       |
| 10         | jjjjj      | 2015-01-10 |
| null       | hhhhh      | 2015-01-08 |
+------------+------------+------------+
10 rows selected (0.131 seconds)
{code}
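
A possible way to narrow this down (an untested sketch, assuming the same t1 and 
t2 tables as in the failing query) is to alias the columns of the second branch 
so that both branches of the UNION ALL expose identical names. If this variant 
runs cleanly, that would point to the differing column names, rather than the 
data in t2, as the trigger.
{code}
-- Hypothetical variant of the failing query: a2, b2, c2 are aliased to a1, b1, c1
select distinct sq.x1, sq.x2, sq.x3
from (
  select a1, b1, c1 from t1
  union all
  select a2 as a1, b2 as b1, c2 as c1 from t2
) as sq(x1, x2, x3);
{code}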

This bug may be caused by DRILL-2203, but I'm filing it separately because the 
failure mode is different and the query will need to be re-verified once 
DRILL-2203 is fixed. The tables for the query are attached to DRILL-2203.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
