[ 
https://issues.apache.org/jira/browse/IMPALA-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184785#comment-17184785
 ] 

ASF subversion and git services commented on IMPALA-7782:
---------------------------------------------------------

Commit e133d1838ab05e75007fef24e2ce1b6f18113c8d in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e133d18 ]

IMPALA-7782: fix constant NOT IN subqueries that can return 0 rows

The bug was the the statement rewriter converted NOT IN <subquery>
predicates to != <subquery> predicates when the subquery could
be an empty set. This was invalid, because NOT IN (<empty set>)
is true, but != (<empty set>) is false.

Testing:
Added targeted planner and end-to-end tests.

Ran exhaustive tests.

Change-Id: I66c726f0f66ce2f609e6ba44057191f5929a67fc
Reviewed-on: http://gerrit.cloudera.org:8080/16338
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> discrepancy in results with a subquery containing an agg that produces an 
> empty set
> -----------------------------------------------------------------------------------
>
>                 Key: IMPALA-7782
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7782
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.12.0, Impala 3.1.0
>            Reporter: Michael Brown
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: correctness, query_generator, ramp-up
>
> A discrepancy exists between Impala and Postgres when a subquery contains an 
> agg and results in an empty set, yet the WHERE clause looking at the subquery 
> should produce a "True" condition.
> Example queries include:
> {noformat}
> USE functional;
> SELECT id
> FROM alltypestiny
> WHERE -1 NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false);
> SELECT id
> FROM alltypestiny
> WHERE NULL NOT IN (SELECT COUNT(id) FROM alltypestiny HAVING false);
> SELECT id
> FROM alltypestiny
> WHERE (SELECT COUNT(id) FROM alltypestiny HAVING false) IS NULL;
> {noformat}
> These queries do not produce any rows in Impala. In Postgres, the queries 
> produce all 8 rows for the functional.alltypestiny id column.
> Thinking maybe there were Impala and Postgres differences with {{NOT IN}} 
> behavior, I also tried this:
> {noformat}
> USE functional;
> SELECT id
> FROM alltypestiny
> WHERE -1 NOT IN (SELECT 1 FROM alltypestiny WHERE bool_col IS NULL);
> {noformat}
> This subquery also produces an empty set just like the subquery in the 
> problematic queries at the top, but unlike those queries, this full query 
> returns the same results in Impala and Postgres (all 8 rows for the 
> functional.alltypestiny id column).
> For anyone interested in this bug, you can migrate data into postgres in a 
> dev environment using
> {noformat}
> tests/comparison/data_generator.py --use-postgresql --migrate-table-names 
> alltypestiny --db-name functional migrate
> {noformat}
> This is in 2.12 at least, so it's not a 3.1 regression.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to