venkata91 commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r657537534
##########
File path: core/src/main/scala/org/apache/spark/Dependency.scala
##########
@@ -148,6 +153,18 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](
}
}
+  def resetShuffleMergeState(): Unit = {
+    _shuffleMergeEnabled = canShuffleMergeBeEnabled()
+    _shuffleMergedFinalized = false
+    mergerLocs = Nil
Review comment:
Since all the subsequent stages are yet to run (or be resubmitted), we would
reuse the mergers once they are fetched again, right? Also, if we are
resubmitting due to a fetch failure, we wouldn't know how many nodes (or
mergers) were lost; reusing the old set in that case could cause more pushed
blocks to go unmerged, right? Given that, isn't it better to just get a new
set of mergers?
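
To make the trade-off concrete, here is a minimal, self-contained Scala sketch (a toy model, not Spark's real classes; `MergerLoc`, `fetchHealthyMergers`, and `prepareStage` are hypothetical stand-ins) of how clearing `mergerLocs` forces a resubmitted stage to request a fresh merger set instead of pushing to possibly lost nodes:

```scala
// Toy model (not Spark's actual classes) of the reset-vs-reuse question above.
object MergerResetSketch {
  final case class MergerLoc(host: String)

  class ShuffleDep {
    private var mergerLocs: Seq[MergerLoc] = Nil
    def setMergerLocs(locs: Seq[MergerLoc]): Unit = mergerLocs = locs
    def getMergerLocs: Seq[MergerLoc] = mergerLocs
    // Mirrors the PR's resetShuffleMergeState(): drop the old merger set.
    def resetShuffleMergeState(): Unit = mergerLocs = Nil
  }

  // Stand-in for the scheduler asking the cluster for currently healthy mergers.
  def fetchHealthyMergers(): Seq[MergerLoc] =
    Seq(MergerLoc("host-a"), MergerLoc("host-b"))

  def prepareStage(dep: ShuffleDep): Unit = {
    // Because the reset left mergerLocs empty, a resubmitted stage takes this
    // branch and gets a new merger set rather than pushing to lost nodes.
    if (dep.getMergerLocs.isEmpty) {
      dep.setMergerLocs(fetchHealthyMergers())
    }
  }

  def main(args: Array[String]): Unit = {
    val dep = new ShuffleDep
    prepareStage(dep)            // initial attempt picks mergers
    dep.resetShuffleMergeState() // fetch failure => stage resubmitted
    prepareStage(dep)            // fresh mergers, not the stale set
    println(dep.getMergerLocs)
  }
}
```

In this model, reusing the stale set would mean `prepareStage` skips the refetch even when some hosts behind it are gone, which is exactly the "more pushed blocks not merged" risk raised above.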
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]