[
https://issues.apache.org/jira/browse/SPARK-55461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-55461:
-----------------------------------
Labels: pull-request-available (was: )
> Improve AQE Coalesce Grouping warning messages when numOfPartitions of
> ShuffleStages in the same coalesce group are not equal
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-55461
> URL: https://issues.apache.org/jira/browse/SPARK-55461
> Project: Spark
> Issue Type: Task
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Eren Avsarogullari
> Priority: Major
> Labels: pull-request-available
>
> Spark's Adaptive Query Execution (AQE) framework coalesces eligible shuffle
> partitions (e.g., empty or small shuffle partitions) during query
> execution. This optimization requires creating coalesce groups for
> related shuffle stages (e.g., a SortMergeJoin can have one ShuffleQueryStage
> per join leg) to guarantee that both SMJ legs have the same number of
> partitions before the join executes. To create coalesce groups for related
> shuffle stages, the SparkPlan tree is traversed to find ShuffleQueryStages.
> SPARK-46590 fixed an incorrect coalesce grouping problem by adding
> BinaryExecNode support to the SparkPlan tree traversal. This PR introduces
> the following complementary improvements as a follow-up to SPARK-46590:
> *1-* Add a warning log message to
> ShufflePartitionsUtil.coalescePartitionsWithoutSkew() when numOfPartitions of
> ShuffleStages in the same coalesce group are not equal. This is needed for
> consistency, because ShufflePartitionsUtil.coalescePartitionsWithSkew()
> already logs a warning message for the same case.
> *2-* Add the problematic shuffleStageIds to the warning messages when
> numOfPartitions of ShuffleStages in the same coalesce group are not equal.
> This information can help with troubleshooting.
> *3-* Align the warning logs of
> ShufflePartitionsUtil.coalescePartitionsWithoutSkew() and
> coalescePartitionsWithSkew().
> *4-* Add 2 new UT cases.
> The current UT cases cover the following use cases:
> {code:java}
> skewed SMJ under Union under BNLJ,
> skewed SMJ under Union under CartesianProduct{code}
> This PR also adds the following new UT cases:
> {code:java}
> 4.1- skewed SMJ under Union under BHJ,
> 4.2- non-skewed SMJ under Union under BHJ
> {code}
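
The check described in items 1 to 3 can be modeled with a small sketch. This is a Python illustration only, not Spark's Scala implementation: `ShuffleStageInfo`, `mismatch_warning`, and `can_coalesce` are hypothetical names, and the message wording is an assumption rather than the actual log text. It shows a single helper that both the with-skew and without-skew paths could share, so the warning is identical in both cases and lists the stage IDs involved.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger("coalesce_sketch")


@dataclass(frozen=True)
class ShuffleStageInfo:
    # Hypothetical stand-in for a shuffle stage: its ID and partition count.
    stage_id: int
    num_partitions: int


def mismatch_warning(group: List[ShuffleStageInfo]) -> Optional[str]:
    """Return a warning text if the stages of one coalesce group disagree
    on their number of partitions, otherwise None."""
    distinct_counts = {stage.num_partitions for stage in group}
    if len(distinct_counts) <= 1:
        return None
    # Include every stage ID with its partition count to ease troubleshooting.
    details = ", ".join(
        f"stage {stage.stage_id}: {stage.num_partitions}" for stage in group
    )
    return (
        "Skipping partition coalescing: shuffle stages in the same coalesce "
        f"group have different numbers of partitions ({details})."
    )


def can_coalesce(group: List[ShuffleStageInfo]) -> bool:
    """Shared guard for both code paths: log the warning and report whether
    the group is safe to coalesce."""
    message = mismatch_warning(group)
    if message is not None:
        logger.warning(message)
        return False
    return True
```

Calling `can_coalesce` on a group whose stages have 200 and 100 partitions returns `False` and logs one warning naming both stages; a group with equal counts returns `True` without logging.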