[ 
https://issues.apache.org/jira/browse/SPARK-55461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eren Avsarogullari updated SPARK-55461:
---------------------------------------
    Description: 
The Spark Adaptive Query Execution (AQE) framework coalesces eligible shuffle 
partitions (e.g. empty or small shuffle partitions) during query 
execution. This optimization requires creating coalesce groups for related 
shuffle stages (e.g. a SortMergeJoin can have one ShuffleQueryStage per join leg) 
to guarantee that both SMJ legs have the same number of partitions before join 
execution. To create coalesce groups for related shuffle stages, the SparkPlan 
tree is traversed to find ShuffleQueryStages. SPARK-46590 fixed an 
incorrect coalesce grouping problem by adding BinaryExecNode support to the 
SparkPlan tree traversal. This PR introduces the following complementary 
improvements as a follow-up to SPARK-46590:

*1-* Adding a warning log message to 
ShufflePartitionsUtil.coalescePartitionsWithoutSkew() when the numOfPartitions of 
ShuffleStages in the same coalesce group are not equal. This is required for 
consistency, because ShufflePartitionsUtil.coalescePartitionsWithSkew() logs 
a warning message for the same case.

*2-* Adding the problematic shuffleStageIds to the warning messages when the 
numOfPartitions of ShuffleStages in the same coalesce group are not equal. This 
information can help with troubleshooting.

*3-* Aligning the warning logs of both the 
ShufflePartitionsUtil.coalescePartitionsWithoutSkew() and 
coalescePartitionsWithSkew() cases.

*4-* Adding 2 new UT cases.
The current UT cases cover the following use cases:
{code:java}
skewed SMJ under Union under BNLJ,
skewed SMJ under Union under CartesianProduct{code}
This PR also adds the following new UT cases:
{code:java}
4.1- skewed SMJ under Union under BHJ,
4.2- non-skewed SMJ under Union under BHJ
{code}
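
The check described in items 1-3 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Spark implementation: the class CoalesceGroupCheck, the StageInfo holder, the mismatchWarning() helper and the message wording are all hypothetical names chosen for this example. It only shows the idea of comparing the partition counts inside one coalesce group and reporting the shuffleStageIds involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class CoalesceGroupCheck {

    // Hypothetical stand-in for a shuffle stage: its id and its number of partitions.
    static final class StageInfo {
        final int shuffleStageId;
        final int numPartitions;

        StageInfo(int shuffleStageId, int numPartitions) {
            this.shuffleStageId = shuffleStageId;
            this.numPartitions = numPartitions;
        }
    }

    // Returns a warning message when the stages of one coalesce group disagree
    // on their partition count, and an empty Optional when they all agree.
    static Optional<String> mismatchWarning(List<StageInfo> group) {
        long distinctCounts = group.stream()
            .map(s -> s.numPartitions)
            .distinct()
            .count();
        if (distinctCounts <= 1) {
            return Optional.empty();
        }
        // Include every stage id with its partition count to ease troubleshooting.
        String details = group.stream()
            .map(s -> "shuffleStageId=" + s.shuffleStageId
                + " numPartitions=" + s.numPartitions)
            .collect(Collectors.joining(", "));
        // Assumed wording, for illustration only.
        return Optional.of("Coalesce group has shuffle stages with different "
            + "numbers of partitions: " + details);
    }

    public static void main(String[] args) {
        List<StageInfo> mixed = Arrays.asList(new StageInfo(3, 200), new StageInfo(4, 150));
        mismatchWarning(mixed).ifPresent(System.out::println);
    }
}
```

Using one helper for both the skew and non-skew paths is one way to keep the two warning messages aligned, which is the intent of item 3.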



> Improve AQE Coalesce Grouping warning messages when numOfPartitions of 
> ShuffleStages in the same coalesce group are not equal
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-55461
>                 URL: https://issues.apache.org/jira/browse/SPARK-55461
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 4.2.0
>            Reporter: Eren Avsarogullari
>            Priority: Major


