Eren Avsarogullari created SPARK-55461:
------------------------------------------
Summary: Improve AQE Coalesce Grouping warning messages when
numOfPartitions of ShuffleStages in the same coalesce group are not equal
Key: SPARK-55461
URL: https://issues.apache.org/jira/browse/SPARK-55461
Project: Spark
Issue Type: Task
Components: SQL
Affects Versions: 4.2.0
Reporter: Eren Avsarogullari
Spark's Adaptive Query Execution (AQE) framework coalesces eligible shuffle
partitions (e.g. empty or small shuffle partitions) during query
execution. This optimization requires creating coalesce groups for related
shuffle stages (e.g. a SortMergeJoin can have one ShuffleQueryStage per join leg)
to guarantee that both SMJ legs have the same number of partitions before join
execution. To create coalesce groups for related shuffle stages, the SparkPlan
tree is traversed to find the ShuffleQueryStages.
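The grouping idea can be illustrated with a toy model. The following is a minimal sketch, not Spark's implementation: the classes `ShuffleStage`, `SortMergeJoin`, and `UnionNode` and the function `collect_coalesce_groups` are hypothetical names, and the rule that each Union child forms its own groups is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShuffleStage:
    """Hypothetical stand-in for a shuffle query stage (a leaf of the plan)."""
    stage_id: int
    num_partitions: int


@dataclass
class SortMergeJoin:
    """Hypothetical binary node: both legs must end up in one coalesce group."""
    left: object
    right: object


@dataclass
class UnionNode:
    """Hypothetical n-ary node: this sketch assumes each child is grouped separately."""
    children: list


def collect_stages(node) -> List[ShuffleStage]:
    """Return every shuffle stage beneath `node`, left to right."""
    if isinstance(node, ShuffleStage):
        return [node]
    if isinstance(node, SortMergeJoin):
        return collect_stages(node.left) + collect_stages(node.right)
    if isinstance(node, UnionNode):
        return [s for child in node.children for s in collect_stages(child)]
    raise TypeError(f"unsupported node: {node!r}")


def collect_coalesce_groups(node) -> List[List[ShuffleStage]]:
    """Walk the toy plan and return one list of stages per coalesce group."""
    if isinstance(node, ShuffleStage):
        return [[node]]
    if isinstance(node, UnionNode):
        # Assumption: Union children are independent, so recurse per child.
        return [g for child in node.children for g in collect_coalesce_groups(child)]
    if isinstance(node, SortMergeJoin):
        # Both join legs belong to the same group.
        return [collect_stages(node)]
    raise TypeError(f"unsupported node: {node!r}")


plan = UnionNode([
    SortMergeJoin(ShuffleStage(0, 200), ShuffleStage(1, 200)),
    SortMergeJoin(ShuffleStage(2, 100), ShuffleStage(3, 100)),
])
groups = [[s.stage_id for s in g] for g in collect_coalesce_groups(plan)]
print(groups)
```

In this toy plan the two joins under the union yield two separate groups, each holding the stage ids of one join's legs.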
SPARK-46590 (https://issues.apache.org/jira/browse/SPARK-46590) fixed an
incorrect coalesce grouping problem by adding BinaryExecNode support to the
SparkPlan tree traversal. This PR introduces the following complementary
improvements as a follow-up to SPARK-46590:
*1-* Add a warning log message to
ShufflePartitionsUtil.coalescePartitionsWithoutSkew() when numOfPartitions of
ShuffleStages in the same coalesce group are not equal. This is needed for
consistency because ShufflePartitionsUtil.coalescePartitionsWithSkew() already
logs a warning message for the same case.
*2-* Add the problematic shuffleStageIds to the warning messages when
numOfPartitions of ShuffleStages in the same coalesce group are not equal. This
information can help with troubleshooting.
*3-* Align the warning logs for both the
ShufflePartitionsUtil.coalescePartitionsWithoutSkew() and
coalescePartitionsWithSkew() cases.
*4-* Add two new UT cases.
Current UT cases cover the following use cases:
{code:java}
skewed SMJ under Union under BNLJ,
skewed SMJ under Union under CartesianProduct{code}
This PR also adds the following new UT cases:
{code:java}
4.1- skewed SMJ under Union under BHJ,
4.2- non-skewed SMJ under Union under BHJ
{code}
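The check described in items 1 to 3 can be sketched as follows. This is a simplified illustration, not the code in ShufflePartitionsUtil: the function names `mismatch_warning` and `can_coalesce`, the logger name, and the message wording are all hypothetical. It only shows one helper producing the same warning, including the offending shuffle stage ids, for any caller.

```python
import logging
from typing import Dict, Optional

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("coalesce_sketch")


def mismatch_warning(stage_partitions: Dict[int, int]) -> Optional[str]:
    """Return a warning text if stages in one coalesce group disagree on
    their partition count, otherwise None.

    `stage_partitions` maps shuffleStageId -> numOfPartitions.
    """
    if len(set(stage_partitions.values())) <= 1:
        return None
    details = ", ".join(
        f"shuffleStageId={stage_id} numPartitions={count}"
        for stage_id, count in sorted(stage_partitions.items())
    )
    # The wording below is illustrative, not Spark's actual log message.
    return (
        "Skipping partition coalescing: shuffle stages in the same coalesce "
        f"group have different partition counts ({details})"
    )


def can_coalesce(stage_partitions: Dict[int, int]) -> bool:
    """Shared guard that skew and non-skew code paths could both call,
    so the two paths emit an identical warning."""
    message = mismatch_warning(stage_partitions)
    if message is not None:
        logger.warning(message)
        return False
    return True


ok = can_coalesce({3: 200, 4: 200})
bad = can_coalesce({5: 200, 7: 100})
```

Routing both paths through one helper keeps the message format identical, which is the alignment item 3 asks for.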