[
https://issues.apache.org/jira/browse/SPARK-47284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-47284:
----------------------------------
Target Version/s: (was: 4.0.0)
> We should ensure enough parallelism when ShuffleExchangeLike join with specs
> without shuffle
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-47284
> URL: https://issues.apache.org/jira/browse/SPARK-47284
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Qi Zhu
> Priority: Major
>
> The following case is introduced by
> https://issues.apache.org/jira/browse/SPARK-35703
> // When choosing specs, we should consider those children with no
> `ShuffleExchangeLike` node
> // first. For instance, if we have:
> // A: (No_Exchange, 100) <---> B: (Exchange, 120)
> // it's better to pick A and change B to (Exchange, 100) instead of picking B
> and insert a
> // new shuffle for A.
> *But we'd better improve it in some cases, for example:*
> A: (No_Exchange, 2) <---> B: (Exchange, 100)
> The current logic will change to:
> A: (No_Exchange, 2) <---> B: (Exchange,2)
> It actually not ensure enough parallelism, it will reduce the performance i
> think.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]