Steve Carlin created IMPALA-13587:
-------------------------------------

             Summary: Calcite planner: outer join not aggregating nulls properly
                 Key: IMPALA-13587
                 URL: https://issues.apache.org/jira/browse/IMPALA-13587
             Project: IMPALA
          Issue Type: Sub-task
            Reporter: Steve Carlin


The following query is producing incorrect results:

select t2.int_col y from alltypessmall t1 left outer join alltypestiny t2 on 
t1.int_col = t2.int_col group by 1


... because nulls are not aggregated properly across multiple nodes.  The cause 
is that the value equivalency graph is being set for the join conjunct of an 
outer join. When a hash-partitioned join node is used, an optimization skips 
the aggregation step that combines groups across nodes if it deduces, from the 
value transfer graph, that all rows with the same partition column value are 
sent to the same node. 

The bug here is that even though the outer join uses an equi-conjunct, the 
left and right sides are not always equal: when a row has no match, the column 
from the outer-joined side becomes null.
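
A minimal Python sketch of the failure mode, for illustration only. It does 
not use Impala or Calcite code: the table contents, the node count, and the 
helper names (partition, left_outer_join_on_node, grouped_output) are all 
hypothetical, and modulo stands in for the hash function. Rows are partitioned 
on the join key, each node joins and groups locally, and the cross-node merge 
is either skipped or applied.

```python
from collections import defaultdict

# Hypothetical toy data, not the real contents of alltypessmall / alltypestiny.
T1_INT_COL = [0, 1, 2, 3, 4, 5]   # left side: t1.int_col
T2_INT_COL = [0, 1]               # right side: t2.int_col
NUM_NODES = 3

def partition(values, num_nodes):
    """Partition values on the join key (modulo stands in for a hash)."""
    parts = defaultdict(list)
    for v in values:
        parts[v % num_nodes].append(v)
    return parts

def left_outer_join_on_node(left_rows, right_rows):
    """Return t2.int_col for each joined row; None models SQL NULL."""
    out = []
    for l in left_rows:
        matches = [r for r in right_rows if r == l]
        out.extend(matches if matches else [None])
    return out

def grouped_output(skip_merge):
    """Simulate 'select t2.int_col ... group by 1' across NUM_NODES nodes."""
    left = partition(T1_INT_COL, NUM_NODES)
    right = partition(T2_INT_COL, NUM_NODES)
    per_node_groups = []
    for node in range(NUM_NODES):
        joined = left_outer_join_on_node(left[node], right[node])
        per_node_groups.append(set(joined))  # local GROUP BY on this node
    if skip_merge:
        # Merge step skipped: each node's groups are emitted as-is.
        return [g for groups in per_node_groups for g in groups]
    # Merge step applied: groups are combined across nodes.
    return list(set().union(*per_node_groups))

print(grouped_output(skip_merge=True).count(None))   # NULL groups without merge
print(grouped_output(skip_merge=False).count(None))  # NULL groups with merge
```

In this sketch the non-null values of t2.int_col do land on a single node 
each, but unmatched left rows produce NULL on every node that has one, so 
skipping the merge emits the NULL group more than once.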

The fix is to avoid registering the equi-conjunct if the values are not always 
equal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
