[ https://issues.apache.org/jira/browse/IMPALA-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manish Maheshwari updated IMPALA-13261:
---------------------------------------
Description:
Currently, in partitioned joins, all NULL keys are hashed to a single value and
sent to a single fragment instance. This can cause data skew if the number of
NULL keys is large.
The planner could give preference to BROADCAST in LEFT OUTER JOIN when the
number of NULLs is large on the probe side.
Another potential solution for the same problem is IMPALA-13260, which is about
sending rows with NULL keys to local fragment instances in this situation.
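To make the skew concrete, here is a minimal sketch in Python. It is not Impala
code: the function names, the fixed NULL hash value, and the use of Python's
built-in hash() are all assumptions made for illustration. It only shows how
many rows each fragment instance receives when every NULL key maps to the same
hash value.

```python
from collections import Counter

# Assumption: any fixed value works here; the point is only that every
# NULL key gets the same hash and therefore the same destination.
NULL_HASH = 0


def partition_rows(keys, num_instances):
    """Count how many rows each instance receives under hash partitioning.

    None stands in for a SQL NULL join key.
    """
    counts = Counter()
    for key in keys:
        h = NULL_HASH if key is None else hash(key)
        counts[h % num_instances] += 1
    return counts


def skew_ratio(counts, num_instances):
    """Largest per-instance load divided by the mean load (1.0 is even)."""
    total = sum(counts.values())
    mean = total / num_instances
    return max(counts.values()) / mean


# Hypothetical probe side: 1000 distinct non-NULL keys and 9000 NULL keys,
# partitioned across 10 fragment instances.
keys = list(range(1000)) + [None] * 9000
counts = partition_rows(keys, 10)
ratio = skew_ratio(counts, 10)
```

With this made-up input, one instance receives all 9000 NULL rows on top of its
share of the non-NULL rows, while the others receive only their share of the
1000 non-NULL rows. A broadcast join avoids this because probe-side rows are
not repartitioned at all.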
> Consider the effect of NULL keys when choosing BROADCAST vs SHUFFLE join
> ------------------------------------------------------------------------
>
> Key: IMPALA-13261
> URL: https://issues.apache.org/jira/browse/IMPALA-13261
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Csaba Ringhofer
> Priority: Major