[ https://issues.apache.org/jira/browse/IMPALA-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184785#comment-17184785 ]
ASF subversion and git services commented on IMPALA-7782: --------------------------------------------------------- Commit e133d1838ab05e75007fef24e2ce1b6f18113c8d in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=e133d18 ] IMPALA-7782: fix constant NOT IN subqueries that can return 0 rows The bug was the the statement rewriter converted NOT IN <subquery> predicates to != <subquery> predicates when the subquery could be an empty set. This was invalid, because NOT IN (<empty set>) is true, but != (<empty set>) is false. Testing: Added targeted planner and end-to-end tests. Ran exhaustive tests. Change-Id: I66c726f0f66ce2f609e6ba44057191f5929a67fc Reviewed-on: http://gerrit.cloudera.org:8080/16338 Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> > discrepancy in results with a subquery containing an agg that produces an > empty set > ----------------------------------------------------------------------------------- > > Key: IMPALA-7782 > URL: https://issues.apache.org/jira/browse/IMPALA-7782 > Project: IMPALA > Issue Type: Bug > Components: Frontend > Affects Versions: Impala 2.12.0, Impala 3.1.0 > Reporter: Michael Brown > Assignee: Tim Armstrong > Priority: Major > Labels: correctness, query_generator, ramp-up > > A discrepancy exists between Impala and Postgres when a subquery contains an > agg and results in an empty set, yet the WHERE clause looking at the subquery > should produce a "True" condition. > Example queries include: > {noformat} > USE functional; > SELECT id > FROM alltypestiny > WHERE -1 NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false); > SELECT id > FROM alltypestiny > WHERE NULL NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false); > SELECT id > FROM alltypestiny > WHERE (SELECT COUNT(id) FROM alltypestiny HAVING false) IS NULL; > {noformat} > These queries do not produce any rows in Impala. In Postgres, the queries > produce all 8 rows for the functional.alltypestiny id column. > Thinking maybe there were Impala and Postgres differences with {{NOT IN}} > behavior, I also tried this: > {noformat} > USE functional; > SELECT id > FROM alltypestiny > WHERE -1 NOT IN (SELECT 1 FROM alltypestiny WHERE bool_col IS NULL); > {noformat} > This subquery also produces an empty set just like the subquery in the > problematic queries at the top, but unlike those queries, this full query > returns the same results in Impala and Postgres (all 8 rows for the > functional.alltypestiny id column). > For anyone interested in this bug, you can migrate data into postgres in a > dev environment using > {noformat} > tests/comparison/data_generator.py --use-postgresql --migrate-table-names > alltypestiny --db-name functional migrate > {noformat} > This is in 2.12 at least, so it's not a 3.1 regression. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org