Jinfeng Ni created DRILL-5681: --------------------------------- Summary: Incorrect query result when query uses star and correlated subquery Key: DRILL-5681 URL: https://issues.apache.org/jira/browse/DRILL-5681 Project: Apache Drill Issue Type: Bug Reporter: Jinfeng Ni
The following repo was based on a testcase provided by Arjun Rajan(ara...@mapr.com). Drill returns incorrect query result, when the query has a correlated subquery and querying against a view defined with select *, or querying a subquery with select *. Case 1: Querying view with select * + correlated subquery {code} create view dfs.tmp.region_view as select * from cp.`tpch/region.parquet`; {code} //Q1 : return 25 rows. The correct answer is 0 row. {code} SELECT n_nationkey, n_name FROM dfs.tmp.nation_view a WHERE NOT EXISTS (SELECT 1 FROM cp.`tpch/region.parquet` b WHERE b.r_regionkey = a.n_regionkey ) +--------------+-----------------+ | n_nationkey | n_name | +--------------+-----------------+ | 0 | ALGERIA | | 1 | ARGENTINA | | 2 | BRAZIL | ... | 24 | UNITED STATES | +--------------+-----------------+ 25 rows selected (0.614 seconds) {code} // Q2: return 0 row. The correct answer is 25 rows. {code} SELECT n_nationkey, n_name FROM dfs.tmp.nation_view a WHERE EXISTS (SELECT 1 FROM cp.`tpch/region.parquet` b WHERE b.r_regionkey = a.n_regionkey ) +--------------+---------+ | n_nationkey | n_name | +--------------+---------+ +--------------+---------+ No rows selected (0.4 seconds) {code} Case 2: Querying a table expression with select * // Q3: return 25 rows. The correct result is 0 row {code} SELECT n_nationkey, n_name FROM ( SELECT * FROM cp.`tpch/nation.parquet` ) a WHERE NOT EXISTS (SELECT 1 FROM cp.`tpch/region.parquet` b WHERE b.r_regionkey = a.n_regionkey ) +--------------+-----------------+ | n_nationkey | n_name | +--------------+-----------------+ | 0 | ALGERIA | | 1 | ARGENTINA | ... | 24 | UNITED STATES | +--------------+-----------------+ 25 rows selected (0.451 seconds) {code} Q4: return 0 row. The correct result is 25 rows. {code} SELECT n_nationkey, n_name FROM ( SELECT * FROM cp.`tpch/nation.parquet` ) a WHERE EXISTS (SELECT 1 FROM cp.`tpch/region.parquet` b WHERE b.r_regionkey = a.n_regionkey ) +--------------+---------+ | n_nationkey | n_name | +--------------+---------+ +--------------+---------+ No rows selected (0.515 seconds) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)