venkata91 commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r657537534



##########
File path: core/src/main/scala/org/apache/spark/Dependency.scala
##########
@@ -148,6 +153,18 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](
     }
   }
 
+  def resetShuffleMergeState(): Unit = {
+    _shuffleMergeEnabled = canShuffleMergeBeEnabled()
+    _shuffleMergedFinalized = false
+    mergerLocs = Nil

Review comment:
       Since all the other subsequent stages are yet to run or be resubmitted, 
we would reuse the mergers once they are fetched again, right? Also, if we are 
resubmitting due to a fetch failure, we wouldn't know how many nodes (or 
mergers) were lost; reusing them in that case could leave more pushed blocks 
unmerged, right? Isn't it better in that case to just get a new set of mergers?
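
For context, here is a minimal standalone Scala sketch of the trade-off raised above. All names in it are illustrative stand-ins, not Spark's actual `ShuffleDependency`/`DAGScheduler` API: after a fetch-failure resubmission the old merger list may contain lost nodes, so clearing `mergerLocs` and letting the scheduler fetch a fresh set avoids pushing blocks to mergers that will never merge them.

```scala
// Hypothetical stand-in types; the real logic lives in ShuffleDependency
// and the DAGScheduler.
case class MergerLoc(host: String, alive: Boolean)

class ShuffleMergeState(var shuffleMergeEnabled: Boolean) {
  var shuffleMergeFinalized: Boolean = false
  var mergerLocs: Seq[MergerLoc] = Nil

  // Mirrors the reset in the diff: clear finalization state and drop the
  // stale merger list so a fresh set must be fetched on resubmission.
  def resetForResubmission(): Unit = {
    shuffleMergeFinalized = false
    mergerLocs = Nil // reusing the old list could include lost nodes
  }
}

object MergerResetDemo {
  def main(args: Array[String]): Unit = {
    val state = new ShuffleMergeState(shuffleMergeEnabled = true)
    state.mergerLocs = Seq(
      MergerLoc("host-1", alive = true),
      MergerLoc("host-2", alive = false)) // lost after the fetch failure

    // If the stale list were reused, pushes to host-2 would never be merged.
    val usableIfReused = state.mergerLocs.count(_.alive)
    println(s"usable mergers if reused: $usableIfReused of ${state.mergerLocs.size}")

    state.resetForResubmission()
    println(s"mergers after reset: ${state.mergerLocs.size} (a fresh set will be fetched)")
  }
}
```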



