[ 
https://issues.apache.org/jira/browse/SPARK-55461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-55461:
-----------------------------------
    Labels: pull-request-available  (was: )

> Improve AQE Coalesce Grouping warning messages when numOfPartitions of 
> ShuffleStages in the same coalesce group are not equal
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-55461
>                 URL: https://issues.apache.org/jira/browse/SPARK-55461
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 4.2.0
>            Reporter: Eren Avsarogullari
>            Priority: Major
>              Labels: pull-request-available
>
> The Spark Adaptive Query Execution (AQE) framework coalesces eligible
> shuffle partitions (e.g., empty or small shuffle partitions) during query
> execution. This optimization requires creating coalesce groups for related
> shuffle stages (e.g., a SortMergeJoin can have one ShuffleQueryStage per
> join leg) to guarantee that both SMJ legs have the same number of
> partitions before the join executes. To create coalesce groups for related
> shuffle stages, the SparkPlan tree is traversed to find ShuffleQueryStages.
> SPARK-46590 fixed an incorrect coalesce grouping problem by adding
> BinaryExecNode support to the SparkPlan tree traversal. This PR introduces
> the following complementary improvements as a follow-up to SPARK-46590:
> *1-* Adding a warning log message to
> ShufflePartitionsUtil.coalescePartitionsWithoutSkew() when the
> numOfPartitions of ShuffleStages in the same coalesce group are not equal.
> This is needed for consistency, because
> ShufflePartitionsUtil.coalescePartitionsWithSkew() already logs a warning
> message for the same case.
> *2-* Adding the problematic shuffleStageIds to the warning messages when the
> numOfPartitions of ShuffleStages in the same coalesce group are not equal.
> This information can help with troubleshooting.
> *3-* Aligning the warning logs for both the
> ShufflePartitionsUtil.coalescePartitionsWithoutSkew() and
> coalescePartitionsWithSkew() cases.
> *4-* Adding 2 new UT cases. The current UT cases cover the following use
> cases:
> {code:java}
> skewed SMJ under Union under BNLJ,
> skewed SMJ under Union under CartesianProduct{code}
> This PR also adds the following new UT cases:
> {code:java}
> 4.1- skewed SMJ under Union under BHJ,
> 4.2- non-skewed SMJ under Union under BHJ
> {code}
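
The warning behaviour described in items 1 to 3 can be sketched roughly as follows. This is a minimal Java illustration under stated assumptions, not Spark's implementation: the class CoalesceGroupCheck, the method mismatchWarning, and the message wording are all hypothetical, and a coalesce group is modelled simply as a map from shuffle stage id to its number of partitions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for illustration only; not part of Spark's API.
final class CoalesceGroupCheck {

    /**
     * Returns a warning message when the shuffle stages of one coalesce group
     * report different partition counts, or an empty Optional when they agree.
     *
     * @param partitionsByStageId shuffle stage id -> number of shuffle partitions
     */
    static Optional<String> mismatchWarning(Map<Integer, Integer> partitionsByStageId) {
        long distinctCounts = partitionsByStageId.values().stream().distinct().count();
        if (distinctCounts <= 1) {
            // All stages agree (or the group is empty): nothing to report.
            return Optional.empty();
        }
        // Include every stage id with its partition count so the offending
        // stages can be identified from the log line alone.
        List<String> details = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : partitionsByStageId.entrySet()) {
            details.add("stage " + e.getKey() + " -> " + e.getValue() + " partitions");
        }
        // Assumed wording; the real log text in Spark may differ.
        return Optional.of("Could not coalesce shuffle partitions: stages in the same "
            + "coalesce group have different partition counts ("
            + String.join(", ", details) + ")");
    }
}
```

Routing both the with-skew and without-skew paths through one helper of this kind is one way to keep the two warnings aligned, as item 3 intends.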



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
