Aman Sinha created DRILL-886:
--------------------------------

             Summary: Wrong results for a query with Right Outer Join on the 
second (and subsequent) executions
                 Key: DRILL-886
                 URL: https://issues.apache.org/jira/browse/DRILL-886
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Aman Sinha


The following query with a right outer join produces correct results on the 
first execution in a session but wrong results on the second and subsequent 
executions: every r_regionkey comes back as null. A potential cause can be 
seen by comparing the two EXPLAIN plans: the scan of the nation table projects 
`n_regionkey` in the good run but shows columns=null in the bad run, and the 
HashJoin condition changes from =($3, $1) to =($2, $1).

0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from 
cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on  
n.n_regionkey = r.r_regionkey;

+-------------+-------------+
| n_regionkey | r_regionkey |
+-------------+-------------+
| 0           | 0           |
| 0           | 0           |
| 0           | 0           |
| 0           | 0           |
| 0           | 0           |
| 1           | 1           |
| 1           | 1           |
| 1           | 1           |
| 1           | 1           |
| 1           | 1           |
| 2           | 2           |
| 2           | 2           |
| 2           | 2           |
| 2           | 2           |
| 2           | 2           |
| 3           | 3           |
| 3           | 3           |
| 3           | 3           |
| 3           | 3           |
| 3           | 3           |
| 4           | 4           |
| 4           | 4           |
| 4           | 4           |
| 4           | 4           |
| 4           | 4           |
+-------------+-------------+
25 rows selected (2.207 seconds)

0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from 
cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on  
n.n_regionkey = r.r_regionkey;
+-------------+-------------+
| n_regionkey | r_regionkey |
+-------------+-------------+
| 0           | null        |
| 1           | null        |
| 1           | null        |
| 1           | null        |
| 4           | null        |
| 0           | null        |
| 3           | null        |
| 3           | null        |
| 2           | null        |
| 2           | null        |
| 4           | null        |
| 4           | null        |
| 2           | null        |
| 4           | null        |
| 0           | null        |
| 0           | null        |
| 0           | null        |
| 1           | null        |
| 2           | null        |
| 3           | null        |
| 4           | null        |
| 2           | null        |
| 3           | null        |
| 3           | null        |
| 1           | null        |
+-------------+-------------+
25 rows selected (0.514 seconds)
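
For reference, here is a minimal sketch of what a right outer equi-join should 
return for this data, to show why the all-null second result cannot be right. 
The nation keys are taken from the first (correct) output above. The region 
keys 0 through 4 are an assumption inferred from that same output, and 
right_outer_join is a hypothetical helper, not Drill code.

```python
from collections import Counter


def right_outer_join(left_keys, right_keys):
    """Right outer equi-join on bare keys.

    Returns (right_key, left_key) pairs; left_key is None when the
    right-side row has no match on the left side.
    """
    # Index the left side by key so each right row can look up its matches.
    left_index = {}
    for key in left_keys:
        left_index.setdefault(key, []).append(key)

    rows = []
    for right_key in right_keys:
        matches = left_index.get(right_key)
        if matches:
            rows.extend((right_key, left_key) for left_key in matches)
        else:
            # Unmatched right rows are preserved with a null left side.
            rows.append((right_key, None))
    return rows


# Assumption: region has one row per key 0..4, as implied by the good output.
region_keys = [0, 1, 2, 3, 4]
# From the good output: five nation rows for each region key 0..4.
nation_keys = [key for key in range(5) for _ in range(5)]

rows = right_outer_join(region_keys, nation_keys)
null_rows = [row for row in rows if row[1] is None]
per_key = Counter(row[0] for row in rows)

print(len(rows), len(null_rows))
```

Under these assumptions every nation row finds a matching region row, so a 
correct run returns 25 rows with no null r_regionkey, which matches the first 
execution and not the second.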

EXPLAIN plan for the good run: 

00-00    Screen
00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
00-03          HashJoin(condition=[=($3, $1)], joinType=[right])
00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, 
columns=[SchemaPath [`r_regionkey`]]]])
00-04            Project(*0=[$0], n_regionkey=[$1])
00-06              BroadcastExchange
01-01                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], 
selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])

EXPLAIN plan for the bad run: 

00-00    Screen
00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
00-03          HashJoin(condition=[=($2, $1)], joinType=[right])
00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, 
columns=[SchemaPath [`r_regionkey`]]]])
00-04            Project(*0=[$0], n_regionkey=[$1])
00-06              BroadcastExchange
01-01                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], 
selectionRoot=/tpch/nation.parquet, columns=null]])
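
The two plans differ in only a couple of operators, which is easier to see 
with a line-by-line comparison. The sketch below does that in Python. The plan 
lines are copied from this report with wrapped lines rejoined and indentation 
dropped; diff_plans is a hypothetical helper and assumes both plans list the 
same operators in the same order.

```python
def diff_plans(good, bad):
    """Return (operator_id, good_line, bad_line) for each differing line."""
    diffs = []
    for good_line, bad_line in zip(good, bad):
        if good_line != bad_line:
            # The operator id (e.g. "00-03") is the first token of the line.
            diffs.append((good_line.split()[0], good_line, bad_line))
    return diffs


# Lines shared by both plans, in plan order, minus the two that differ.
region_scan = (
    "00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath "
    "[path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, "
    "columns=[SchemaPath [`r_regionkey`]]]])"
)
nation_scan_prefix = (
    "01-01 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath "
    "[path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, "
)


def build_plan(join_condition, nation_columns):
    """Assemble one plan from the shared lines plus the two varying parts."""
    return [
        "00-00 Screen",
        "00-01 Project(n_regionkey=[$0], r_regionkey=[$1])",
        "00-02 Project(n_regionkey=[$3], r_regionkey=[$1])",
        "00-03 HashJoin(condition=[" + join_condition + "], joinType=[right])",
        region_scan,
        "00-04 Project(*0=[$0], n_regionkey=[$1])",
        "00-06 BroadcastExchange",
        nation_scan_prefix + "columns=" + nation_columns + "]])",
    ]


good_plan = build_plan("=($3, $1)", "[SchemaPath [`n_regionkey`]]")
bad_plan = build_plan("=($2, $1)", "null")

plan_diffs = diff_plans(good_plan, bad_plan)
for operator_id, good_line, bad_line in plan_diffs:
    print(operator_id)
    print("  good:", good_line)
    print("  bad: ", bad_line)
```

The comparison reports operators 00-03 (the HashJoin condition) and 01-01 (the 
projected columns of the nation scan); all other plan lines are identical.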



--
This message was sent by Atlassian JIRA
(v6.2#6252)
