Csaba Ringhofer created IMPALA-13261:
----------------------------------------

             Summary: Consider the effect of NULL keys when choosing BROADCAST 
vs SHUFFLE join
                 Key: IMPALA-13261
                 URL: https://issues.apache.org/jira/browse/IMPALA-13261
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Csaba Ringhofer


Currently, in partitioned (SHUFFLE) joins, all rows with NULL join keys hash 
to the same value and are therefore sent to a single fragment instance. This 
can cause data skew when the number of NULL keys is large.
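
To illustrate the skew, here is a minimal Python sketch. It is not Impala 
code: partition_rows and the choice of 0 as the hash of NULL are assumptions 
made up for this example. It only shows what happens when every NULL key maps 
to one hash value.

```python
from collections import Counter


def partition_rows(keys, num_instances):
    """Count rows per instance under a simple hash-partitioned exchange.

    Hypothetical model: every NULL (None) key gets the same hash value
    (0 here), so all NULL-key rows land on the same instance.
    """
    counts = Counter()
    for key in keys:
        h = 0 if key is None else hash(key)
        counts[h % num_instances] += 1
    return counts


# 1,000 distinct non-NULL keys plus 9,000 NULL keys, 10 instances.
keys = list(range(1000)) + [None] * 9000
counts = partition_rows(keys, 10)
print(counts[0], counts[1])  # 9100 100
```

In this toy input, one instance receives 9,100 of the 10,000 rows while each 
of the other nine receives 100.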

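The planner could give preference to BROADCAST in LEFT OUTER JOIN when the 
number of NULL keys on the probe side is large. A broadcast join does not 
redistribute probe rows, so the NULL-key rows would not be funneled to a 
single instance.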
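
One possible shape for such a preference, as a rough Python sketch only. The 
function names, the skew_threshold parameter and the even-spread assumption 
for non-NULL keys are all hypothetical; none of this reflects Impala's actual 
cost model.

```python
def shuffle_skew_factor(probe_rows, probe_null_keys, num_instances):
    """Busiest instance's probe rows divided by the per-instance average.

    Assumes (as an idealization) that non-NULL keys spread evenly and
    that all NULL keys land on one instance of a partitioned join.
    """
    if probe_rows == 0:
        return 1.0
    non_null = probe_rows - probe_null_keys
    busiest = non_null / num_instances + probe_null_keys
    average = probe_rows / num_instances
    return busiest / average


def prefer_broadcast(join_type, probe_rows, probe_null_keys,
                     num_instances, skew_threshold=2.0):
    """Hypothetical rule: favor BROADCAST for a LEFT OUTER JOIN when the
    estimated NULL-key skew of a partitioned join exceeds the threshold."""
    if join_type != "LEFT OUTER JOIN":
        return False
    factor = shuffle_skew_factor(probe_rows, probe_null_keys, num_instances)
    return factor > skew_threshold


print(prefer_broadcast("LEFT OUTER JOIN", 10000, 9000, 10))  # True
print(prefer_broadcast("LEFT OUTER JOIN", 10000, 0, 10))     # False
```

A real implementation would presumably fold the skew estimate into the 
existing BROADCAST vs SHUFFLE cost comparison rather than use a hard 
threshold.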

IMPALA-13260 is another potential solution to the same problem: it proposes 
sending rows with NULL keys to local fragment instances in this situation.



