Zhen Wang created SPARK-51203:
---------------------------------
Summary: ForceOptimizeSkewedJoin does not take effect for child
aggregations in skewed join
Key: SPARK-51203
URL: https://issues.apache.org/jira/browse/SPARK-51203
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Zhen Wang
ForceOptimizeSkewedJoin allows a skewed join to be optimized even if doing
so introduces an extra shuffle. Currently it only takes effect when the
aggregation comes after the join, not when the aggregation is in a child of
the join. For example:
{code:java}
    HashAggregate
          |
      Exchange
          |
    HashAggregate         Exchange (skewed side)
          |                   |
        Sort                Sort
             \             /
    SortMergeJoin(isSkewJoin = true) {code}
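For reference, here is a minimal sketch of the settings involved, written as a small Python helper that renders them as spark-submit --conf arguments. The three keys are existing Spark SQL options. The names SKEW_JOIN_CONF and to_submit_args are made up for illustration, and the values are just one way to switch the feature on.

```python
# Existing Spark SQL configuration keys; the values are illustrative.
SKEW_JOIN_CONF = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
    "spark.sql.adaptive.forceOptimizeSkewedJoin": "true",
}


def to_submit_args(conf):
    """Render settings as a flat list of spark-submit --conf arguments.

    Hypothetical helper, not part of Spark.
    """
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(SKEW_JOIN_CONF)))
```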
When ForceOptimizeSkewedJoin is enabled, could we introduce an extra shuffle
for the join child so that the skewed join can still be optimized? For example:
{code:java}
    HashAggregate
          |
      Exchange
          |
    HashAggregate
          |
      Exchange
(force extra shuffle)     Exchange (skewed side)
          |                   |
        Sort                Sort
             \             /
    SortMergeJoin(isSkewJoin = true) {code}
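The proposed rewrite can be sketched on a toy plan tree. This is only an illustration of the shape of the change, not Spark's implementation: Node, chain, force_extra_shuffle and operators are hypothetical names, and the sketch assumes the rule simply inserts an Exchange directly below any join-side Sort whose input is not already an Exchange.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a physical plan operator (not a Spark class)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def chain(*names):
    """Build a single-child chain; the first name is the operator closest to the join."""
    node = None
    for name in reversed(names):
        node = Node(name, [node] if node is not None else [])
    return node


def force_extra_shuffle(join):
    """Insert an Exchange under each Sort child whose input is not an Exchange."""
    for sort in join.children:
        below = sort.children[0]
        if not below.name.startswith("Exchange"):
            sort.children = [Node("Exchange (forced extra shuffle)", [below])]
    return join


def operators(node):
    """List operator names along a single-child chain, starting at node."""
    names = []
    while node is not None:
        names.append(node.name)
        node = node.children[0] if node.children else None
    return names


# First plan from the issue: the aggregation sits in the left join child.
left = chain("Sort", "HashAggregate", "Exchange", "HashAggregate")
right = chain("Sort", "Exchange (skewed side)")
join = force_extra_shuffle(Node("SortMergeJoin", [left, right]))

if __name__ == "__main__":
    print(operators(join.children[0]))
    print(operators(join.children[1]))
```

After the rewrite, the left side reads Sort, forced Exchange, HashAggregate, Exchange, HashAggregate from the join outward, which matches the second diagram. The skewed side is left unchanged.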