[
https://issues.apache.org/jira/browse/IMPALA-14685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zoltán Borók-Nagy resolved IMPALA-14685.
----------------------------------------
Fix Version/s: Impala 5.0.0
Resolution: Fixed
> Incorrect OUTER JOIN to INNER JOIN conversion with NOT(... AND ...) predicate
> when enable_outer_join_to_inner_transformation=1
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-14685
> URL: https://issues.apache.org/jira/browse/IMPALA-14685
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.5.0, Impala 4.4.1
> Reporter: Dmitriy Maslov
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Fix For: Impala 5.0.0
>
>
> When enable_outer_join_to_inner_transformation=1 is enabled, queries
> containing a OUTER JOIN with a WHERE clause using `NOT(t1.column = value AND
> t2.column = value)` predicate are incorrectly transformed into INNER JOIN,
> resulting in missing rows that should be included in the result set.
> h3. Reproduction Steps
> {code:sql}
> -- Clean up if tables exist
> DROP TABLE IF EXISTS users;
> DROP TABLE IF EXISTS subscriptions;
> -- Create test tables
> CREATE TABLE users (
> user_id INT,
> status STRING
> );
> CREATE TABLE subscriptions (
> subscription_id INT,
> user_id INT,
> plan_type STRING
> );
> -- Insert test data
> INSERT INTO users VALUES
> (1, 'active'),
> (2, 'inactive'),
> (3, 'active'),
> (4, 'inactive');
> INSERT INTO subscriptions VALUES
> (1, 1, 'premium'), -- User 1: premium plan
> (2, 2, 'basic'); -- User 2: basic plan
> -- ========================================
> -- Demonstrate the bug
> -- ========================================
> SET enable_outer_join_to_inner_transformation=0;
> SELECT *
> FROM users u
> LEFT JOIN subscriptions s ON u.user_id = s.user_id
> WHERE NOT(u.status = 'active' AND s.plan_type = 'premium');
> -- Correct result
> -- +---------+----------+-----------------+---------+-----------+
> -- | user_id | status | subscription_id | user_id | plan_type |
> -- +---------+----------+-----------------+---------+-----------+
> -- | 2 | inactive | 2 | 2 | basic |
> -- | 4 | inactive | NULL | NULL | NULL |
> -- +---------+----------+-----------------+---------+-----------+
> SET enable_outer_join_to_inner_transformation=1;
> SELECT *
> FROM users u
> LEFT JOIN subscriptions s ON u.user_id = s.user_id
> WHERE NOT(u.status = 'active' AND s.plan_type = 'premium');
> -- Incorrect when enable_outer_join_to_inner_transformation=1;
> -- +---------+----------+-----------------+---------+-----------+
> -- | user_id | status | subscription_id | user_id | plan_type |
> -- +---------+----------+-----------------+---------+-----------+
> -- | 2 | inactive | 2 | 2 | basic |
> -- +---------+----------+-----------------+---------+-----------+ {code}
> h3. Proposed Fix
> File: fe/src/main/java/org/apache/impala/analysis/Analyzer.java
> Method: private boolean isNullableConjunct(Expr e, List<TupleId> tupleIds)
> Change: Add the following line at the beginning of the method to push
> negation down to operands before analyzing the expression:
> {code:java}
> e = Expr.pushNegationToOperands(e); {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)