Re: [PR] [SPARK-56145][SQL] Enforce minimum partition count for join stages in CoalesceShufflePartitions [spark]

via GitHub Thu, 04 Jun 2026 00:32:19 -0700


shrirangmhalgi commented on code in PR #56243:
URL: https://github.com/apache/spark/pull/56243#discussion_r3354266459



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CoalesceShufflePartitions.scala:
##########
@@ -171,7 +200,8 @@ case class CoalesceShufflePartitions(session: SparkSession) 
extends AQEShuffleRe
       if (shuffleStages.forall(s => isSupported(s.shuffleStage.shuffle))) {
         // The recursion stops here, we need to call 
`p.exists(isExplodingJoin)` and find out if
         // there is any exploding join in this sub-plan-tree.
-        Seq(CoalesceGroup(shuffleStages, hasExplodingJoin || 
p.exists(isExplodingJoin)))
+        Seq(CoalesceGroup(shuffleStages, hasExplodingJoin || 
p.exists(isExplodingJoin),

Review Comment:
   Minor: Would it help readability to add a short comment explaining when 
`p.exists(isPartitionedJoin`) is needed vs `isPartitionedJoin(p)` hitting 
directly? From tracing the recursion, `p` appears to always be the join node 
itself (since `collectCoalesceGroups` collects shuffles from all descendants 
and attributes the group to the topmost consumer). If the `exists` is a safety 
net for plan shapes I'm not considering, the comment would save future readers 
the same trace.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-56145][SQL] Enforce minimum partition count for join stages in CoalesceShufflePartitions [spark]

Reply via email to