[ https://issues.apache.org/jira/browse/DRILL-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacques Nadeau updated DRILL-886:
---------------------------------
    Assignee: Suresh Ollala  (was: Aman Sinha)

> Wrong results for a query with Right Outer Join on the second (and subsequent) executions
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-886
>                 URL: https://issues.apache.org/jira/browse/DRILL-886
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Aman Sinha
>            Assignee: Suresh Ollala
>            Priority: Critical
>             Fix For: 0.5.0
>
>
> The following query with a right outer join produces correct results on the
> first execution in a session but wrong results on the second and subsequent
> executions. A potential cause for the problem can be seen from the two
> Explain plans: the scan of the nation table shows a difference in the
> columns being projected.
>
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from
> cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on
> n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> +-------------+-------------+
> 25 rows selected (2.207 seconds)
>
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from
> cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on
> n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0           | null        |
> | 1           | null        |
> | 1           | null        |
> | 1           | null        |
> | 4           | null        |
> | 0           | null        |
> | 3           | null        |
> | 3           | null        |
> | 2           | null        |
> | 2           | null        |
> | 4           | null        |
> | 4           | null        |
> | 2           | null        |
> | 4           | null        |
> | 0           | null        |
> | 0           | null        |
> | 0           | null        |
> | 1           | null        |
> | 2           | null        |
> | 3           | null        |
> | 4           | null        |
> | 2           | null        |
> | 3           | null        |
> | 3           | null        |
> | 1           | null        |
> +-------------+-------------+
> 25 rows selected (0.514 seconds)
>
> EXPLAIN plan for the good run:
> | 00-00    Screen
> 00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03          HashJoin(condition=[=($3, $1)], joinType=[right])
> 00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04            Project(*0=[$0], n_regionkey=[$1])
> 00-06              BroadcastExchange
> 01-01                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])
>
> Explain plan for the bad run:
> | 00-00    Screen
> 00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03          HashJoin(condition=[=($2, $1)], joinType=[right])
> 00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04            Project(*0=[$0], n_regionkey=[$1])
> 00-06              BroadcastExchange
> 01-01                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=null]])

--
This message was sent by Atlassian JIRA
(v6.2#6252)