[
https://issues.apache.org/jira/browse/SPARK-51203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhen Wang updated SPARK-51203:
------------------------------
Description:
ForceOptimizeSkewedJoin allows optimizing a skewed join even if that
introduces an extra shuffle, but currently it only takes effect for an aggregation
after the join, not for aggregations in the children of the join. For example:
{code:java}
HashAggregate
      |
  Exchange
      |
HashAggregate         Exchange (skewed side)
      |                  |
    Sort                Sort
         \            /
          SortMergeJoin{code}
When ForceOptimizeSkewedJoin is enabled, can we introduce an extra shuffle for
the join child so that the skewed join can be optimized? For example:
{code:java}
    HashAggregate
          |
      Exchange
          |
    HashAggregate
          |
      Exchange
(force extra shuffle)     Exchange (skewed side)
          |                  |
        Sort                Sort
             \            /
              SortMergeJoin(isSkewJoin = true) {code}
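For reference, a query shape that should yield the first plan above might look like the sketch below. The object name, data, and column names are hypothetical and not taken from this issue; only the configuration keys are existing Spark settings. With data this small the skew thresholds will most likely not trigger, so this only illustrates the plan shape (an aggregation in one join child, a plain shuffle on the other).
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical reproduction sketch; names and data are illustrative only.
object SkewedJoinWithChildAggregate {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-51203-sketch")
      .master("local[*]")
      .config("spark.sql.adaptive.enabled", "true")
      .config("spark.sql.adaptive.skewJoin.enabled", "true")
      // Allow skew-join optimization even if it adds an extra shuffle.
      .config("spark.sql.adaptive.forceOptimizeSkewedJoin", "true")
      // Disable broadcast joins so the planner picks a SortMergeJoin.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    // Join child with an aggregation: HashAggregate -> Exchange -> HashAggregate.
    val aggregated = spark.range(0, 1000)
      .select(($"id" % 100).as("key"))
      .groupBy("key")
      .count()

    // Other join child: most rows share key 0, standing in for the skewed side.
    val skewed = spark.range(0, 1000000)
      .select(
        when($"id" % 10 < 9, lit(0L)).otherwise($"id" % 100).as("key"),
        $"id".as("value"))

    // Print the physical plan to inspect where Exchange nodes are placed.
    aggregated.join(skewed, "key").explain()

    spark.stop()
  }
}
{code}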
was:
ForceOptimizeSkewedJoin allows optimizing a skewed join even if that
introduces an extra shuffle, but currently it only takes effect for an aggregation
after the join, not for aggregations in the children of the join. For example:
{code:java}
HashAggregate
      |
  Exchange
      |
HashAggregate         Exchange (skewed side)
      |                  |
    Sort                Sort
         \            /
          SortMergeJoin(isSkewJoin = true) {code}
When ForceOptimizeSkewedJoin is enabled, can we introduce an extra shuffle for
the join child so that the skewed join can be optimized? For example:
{code:java}
    HashAggregate
          |
      Exchange
          |
    HashAggregate
          |
      Exchange
(force extra shuffle)     Exchange (skewed side)
          |                  |
        Sort                Sort
             \            /
              SortMergeJoin(isSkewJoin = true) {code}
> ForceOptimizeSkewedJoin does not take effect for child aggregations in skewed
> join
> ----------------------------------------------------------------------------------
>
> Key: SPARK-51203
> URL: https://issues.apache.org/jira/browse/SPARK-51203
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Zhen Wang
> Priority: Major
> Labels: pull-request-available