venkata91 commented on a change in pull request #30691:
URL: https://github.com/apache/spark/pull/30691#discussion_r647868622
##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1739,18 +1756,24 @@ private[spark] class DAGScheduler(
if (mapStage.rdd.isBarrier()) {
// Mark all the map as broken in the map stage, to ensure retry all the tasks on
// resubmitted stage attempt.
- mapOutputTracker.unregisterAllMapOutput(shuffleId)
+ mapOutputTracker.unregisterAllMapAndMergeOutput(shuffleId)
} else if (mapIndex != -1) {
// Mark the map whose fetch failed as broken in the map stage
mapOutputTracker.unregisterMapOutput(shuffleId, mapIndex, bmAddress)
+ if (mapStage.shuffleDep.shuffleMergeEnabled) {
Review comment:
Instead of checking the shuffle dependency flag, which can be set to
false later, only check the global `pushBasedShuffleEnabled` variable.
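
To make the concern concrete, here is a minimal standalone sketch of why the two checks can disagree. It is not the DAGScheduler code: `ShuffleDep`, `FetchFailureSketch`, and `shouldUnregisterMergeResult` are hypothetical stand-ins, and the value of `pushBasedShuffleEnabled` is assumed to be fixed for the lifetime of the scheduler.

```scala
// Hypothetical stand-in for a shuffle dependency; not Spark's ShuffleDependency.
final class ShuffleDep(@volatile var shuffleMergeEnabled: Boolean)

object FetchFailureSketch {
  // Assumption: the global setting is read once and never changes afterwards.
  val pushBasedShuffleEnabled: Boolean = true

  // Decide whether merge output should also be cleaned up after a fetch failure.
  // Deliberately ignores dep.shuffleMergeEnabled, which may have been flipped
  // to false after merge output was already registered.
  def shouldUnregisterMergeResult(dep: ShuffleDep): Boolean =
    pushBasedShuffleEnabled

  def main(args: Array[String]): Unit = {
    val dep = new ShuffleDep(shuffleMergeEnabled = true)
    dep.shuffleMergeEnabled = false // flag flipped later for this dependency

    // The per-dependency check would now skip the cleanup...
    println(s"per-dependency check: ${dep.shuffleMergeEnabled}")
    // ...while the global check still performs it.
    println(s"global check: ${shouldUnregisterMergeResult(dep)}")
  }
}
```

With the per-dependency flag, a dependency whose flag was flipped after merge output was registered would skip the cleanup; the global check does not have that gap.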
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.