venkata91 commented on a change in pull request #34122:
URL: https://github.com/apache/spark/pull/34122#discussion_r785245294
##########
File path: core/src/main/scala/org/apache/spark/Dependency.scala
##########
@@ -144,12 +144,16 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](
_shuffleMergedFinalized = true
}
+ def shuffleMergeFinalized: Boolean = {
Review comment:
I refactored this into `shuffleMergeAllowed` and `shuffleMergeEnabled` to draw
a clear distinction between the two. `shuffleMergeAllowed` covers the static
knobs, such as the following:
1. `Is RDD barrier?`
2. `Can Push shuffle be enabled?`
3. `Disabling push shuffle for retry once the determinate stage attempt is
finalized` etc.
`shuffleMergeEnabled` is checked only when `shuffleMergeAllowed` is true; on top
of that, it becomes `true` only if sufficient mergers are available.
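
As a rough sketch of that split (not the actual `ShuffleDependency` code:
`MergeGate`, its constructor parameters, the setters, and the `minMergers`
threshold are hypothetical names used only for illustration), the static check
gates the dynamic one:

```scala
// Hypothetical stand-in for the relevant part of ShuffleDependency.
// All names except shuffleMergeAllowed / shuffleMergeEnabled are assumptions.
final class MergeGate(
    isBarrier: Boolean,          // is the RDD a barrier RDD?
    pushShuffleEnabled: Boolean, // can push shuffle be enabled at all?
    minMergers: Int) {           // assumed threshold for "sufficient mergers"

  // Static knobs: decided up front, and can be switched off for a retry.
  private var _shuffleMergeAllowed: Boolean = !isBarrier && pushShuffleEnabled

  // Merger locations currently assigned; empty until some are provided.
  private var mergers: Seq[String] = Nil

  def shuffleMergeAllowed: Boolean = _shuffleMergeAllowed

  def setShuffleMergeAllowed(allowed: Boolean): Unit = {
    _shuffleMergeAllowed = allowed
  }

  def setMergers(locs: Seq[String]): Unit = {
    mergers = locs
  }

  // Dynamic check: can only be true when merge is allowed in the first place.
  def shuffleMergeEnabled: Boolean =
    shuffleMergeAllowed && mergers.size >= minMergers
}
```

With this shape, a dependency can be allowed but not yet enabled (no mergers
assigned), and disabling `shuffleMergeAllowed` turns `shuffleMergeEnabled` off
regardless of how many mergers are present.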
Given all of the above, we still need two separate methods. The first,
`isShuffleMergeFinalized`, checks only the `shuffleMergedFinalized` value and
`numPartitions > 0`. In the existing code, before the `ShuffleMapStage` starts
we check `if (!shuffleMergeFinalized)` and only then call
`prepareShuffleServicesForShuffleMapStage`. But it is possible that the previous
stage never had `shuffleMergeEnabled` because there were not enough mergers, in
which case the retry would not be shuffle merge enabled either; ideally the
retry can be shuffle merge enabled if enough mergers are available. The second
method is for proceeding with the next stage: there we need to check
`isShuffleMergeFinalizedIfEnabled`, which evaluates
`if (shuffleMergeEnabled) isShuffleMergeFinalized else true`.
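
To make the difference between the two finalization checks concrete, here is a
small sketch. `FinalizeState` and `markShuffleMergeFinalized` are hypothetical
names, `shuffleMergeEnabled` and `numPartitions` are passed in as plain values
rather than computed, and treating an empty RDD (`numPartitions == 0`) as
already finalized is my assumption about how the `numPartitions > 0` check is
meant to behave:

```scala
// Hypothetical stand-in; not the actual ShuffleDependency.
final class FinalizeState(shuffleMergeEnabled: Boolean, numPartitions: Int) {
  private var _shuffleMergedFinalized: Boolean = false

  def markShuffleMergeFinalized(): Unit = {
    _shuffleMergedFinalized = true
  }

  // Looks only at the finalized flag and the partition count.
  // Assumption: with zero partitions there is nothing to merge, so report true.
  def isShuffleMergeFinalized: Boolean =
    if (numPartitions > 0) _shuffleMergedFinalized else true

  // Used to decide whether the next stage may proceed:
  // when merge was never enabled there is nothing to wait for.
  def isShuffleMergeFinalizedIfEnabled: Boolean =
    if (shuffleMergeEnabled) isShuffleMergeFinalized else true
}
```

For a stage that never had merge enabled, `isShuffleMergeFinalized` stays
`false`, so a retry can still prepare shuffle services, while
`isShuffleMergeFinalizedIfEnabled` is `true`, so the next stage is not blocked.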
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]