This is an automated email from the ASF dual-hosted git repository.

michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 5b4427ed1beacb13522245a30d304aefbb7afb07
Author: Steve Carlin <[email protected]>
AuthorDate: Wed Nov 27 12:50:49 2024 -0800

    IMPALA-13587: Calcite planner: Outer join not aggregating nulls properly
    
    The following query is producing incorrect results:
    
    select t2.int_col y from alltypessmall t1 left outer join
    alltypestiny t2 on t1.int_col = t2.int_col group by 1
    
    ... due to nulls not being aggregated properly on multiple nodes.
    This is because the value equivalency graph is being set for the
    join conjunct on an outer join. When a hash join partition node is
    being used, there is an optimization that skips the aggregation step
    that combines groups across nodes if, based on the value transfer
    graph, it deduces that all data for the partition column is being
    sent to the same node.
    
    The bug is that although an outer join uses an equi-conjunct, the two
    sides are not always equal: when no match is found, the column from
    the outer-joined side becomes null while the other side does not.
    
    The fix is to avoid registering the equi-conjunct if the values are
    not always equal.
    
    Change-Id: I57e9d4ad4c4af5a4c268e43ac2937064dab6ffd7
    Reviewed-on: http://gerrit.cloudera.org:8080/22138
    Reviewed-by: Michael Smith <[email protected]>
    Reviewed-by: Riza Suminto <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Steve Carlin <[email protected]>
---
 .../java/org/apache/impala/calcite/rel/node/ImpalaJoinRel.java | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaJoinRel.java b/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaJoinRel.java
index 4903c64a0..4d32de0ea 100644
--- a/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaJoinRel.java
+++ b/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaJoinRel.java
@@ -143,13 +143,15 @@ public class ImpalaJoinRel extends Join
           rightInput.planNode_, false /* not a straight join */, distMode, joinOp,
           equiJoinConjuncts, otherJoinConjuncts, filterConjuncts, analyzer);
 
-    if (equiJoinConjuncts.size() > 0) {
-      // register the equi and non-equi join conjuncts with the analyzer such that
-      // value transfer graph creation can consume it
+    // register the equi join conjuncts with the analyzer such that
+    // value transfer graph creation can consume it. It is only useful
+    // in the value transfer graph if the value transfer is equal on
+    // both sides. Any outer join is removed since the value on the outer
+    // join side could be NULL when the left side is not NULL.
+    if (equiJoinConjuncts.size() > 0 && joinOp == JoinOperator.INNER_JOIN) {
       List<Expr> equiJoinExprs = new ArrayList<Expr>(equiJoinConjuncts);
       analyzer.registerConjuncts(getJoinConjunctListToRegister(equiJoinExprs));
     }
-    analyzer.registerConjuncts(getJoinConjunctListToRegister(otherJoinConjuncts));
 
     return new NodeWithExprs(joinNode, outputExprs);
   }
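
The failure mode described in the commit message can be illustrated with a
small standalone simulation. This is only a sketch under stated assumptions,
not Impala code: the class and method names (OuterJoinNullGroups,
perNodeGroups, withoutMerge, withMerge), the toy column values, and the
two-node modulo partitioning are all hypothetical. It hash-partitions the
rows of a left outer join on the left join key, groups by the right-side
column on each node, and compares the result with and without the final
merge aggregation.

```java
import java.util.*;

// Hypothetical illustration only; none of these names exist in Impala.
class OuterJoinNullGroups {

  // Simulates "t1 LEFT OUTER JOIN t2 ON t1.c = t2.c GROUP BY t2.c" where rows
  // are hash-partitioned on t1.c. Returns the distinct t2.c groups per node.
  static List<Set<Integer>> perNodeGroups(int[] left, Set<Integer> right, int numNodes) {
    List<Set<Integer>> groups = new ArrayList<>();
    for (int i = 0; i < numNodes; i++) {
      groups.add(new HashSet<>());
    }
    for (int l : left) {
      // Right side is NULL when the outer join finds no match.
      Integer r = right.contains(l) ? Integer.valueOf(l) : null;
      // Partitioning uses the LEFT value, which is never NULL here.
      int node = Math.floorMod(Integer.hashCode(l), numNodes);
      groups.get(node).add(r);
    }
    return groups;
  }

  // Result if the merge aggregation is skipped: per-node groups are concatenated.
  static List<Integer> withoutMerge(List<Set<Integer>> groups) {
    List<Integer> rows = new ArrayList<>();
    for (Set<Integer> g : groups) {
      rows.addAll(g);
    }
    return rows;
  }

  // Result if groups are combined across nodes, as GROUP BY requires.
  static Set<Integer> withMerge(List<Set<Integer>> groups) {
    Set<Integer> merged = new HashSet<>();
    for (Set<Integer> g : groups) {
      merged.addAll(g);
    }
    return merged;
  }

  public static void main(String[] args) {
    int[] left = {0, 1, 2, 3};                              // toy t1.c values
    Set<Integer> right = new HashSet<>(Arrays.asList(0, 1)); // toy t2.c values
    List<Set<Integer>> groups = perNodeGroups(left, right, 2);
    System.out.println("rows without merge: " + withoutMerge(groups).size());
    System.out.println("rows with merge: " + withMerge(groups).size());
  }
}
```

In this toy setup the unmatched left values 2 and 3 land on different nodes,
so each node emits its own NULL group. Skipping the merge step yields four
rows (NULL twice), while merging yields the expected three. Treating
t1.c = t2.c as a value equivalence is what would make skipping the merge look
safe, which is why the patch registers the equi-conjuncts only for inner
joins.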
