Victsm commented on a change in pull request #30480:
URL: https://github.com/apache/spark/pull/30480#discussion_r534480662



##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -524,10 +669,37 @@ private[spark] class MapOutputTrackerMaster(
     }
   }
 
+  def registerMergeResult(shuffleId: Int, reduceId: Int, status: MergeStatus): Unit = {
+    shuffleStatuses(shuffleId).addMergeResult(reduceId, status)
+  }
+
+  def registerMergeResults(shuffleId: Int, statuses: Seq[(Int, MergeStatus)]): Unit = {
+    statuses.foreach {
+      case (reduceId, status) => registerMergeResult(shuffleId, reduceId, status)
+    }
+  }
+
+  def unregisterMergeResult(shuffleId: Int, reduceId: Int, bmAddress: BlockManagerId): Unit = {
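
For context, a minimal self-contained sketch of the registration pattern these methods follow; SimpleMergeTracker and SimpleMergeStatus below are hypothetical stand-ins for illustration, not Spark's actual MapOutputTrackerMaster or MergeStatus:

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical, simplified stand-in for Spark's MergeStatus.
final case class SimpleMergeStatus(location: String, size: Long)

// Hypothetical stand-in for the tracker; not MapOutputTrackerMaster itself.
final class SimpleMergeTracker {
  // shuffleId -> (reduceId -> merge status for that reduce partition)
  private val merged = TrieMap.empty[Int, TrieMap[Int, SimpleMergeStatus]]

  def registerMergeResult(shuffleId: Int, reduceId: Int, status: SimpleMergeStatus): Unit =
    merged.getOrElseUpdate(shuffleId, TrieMap.empty).update(reduceId, status)

  def registerMergeResults(shuffleId: Int, statuses: Seq[(Int, SimpleMergeStatus)]): Unit =
    statuses.foreach { case (reduceId, status) =>
      registerMergeResult(shuffleId, reduceId, status)
    }

  // Remove the merge result only if it still points at the given location.
  def unregisterMergeResult(shuffleId: Int, reduceId: Int, location: String): Unit =
    merged.get(shuffleId).foreach { statuses =>
      statuses.get(reduceId).filter(_.location == location)
        .foreach(_ => statuses.remove(reduceId))
    }
}
```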

Review comment:
       Want to clarify further here.
   Even with SPARK-32920, we are not currently using this API.
   This is because we fall back to fetching the original unmerged shuffle blocks whenever we encounter a fetch failure on a merged shuffle file.
   This fallback hides such fetch failures from the DAGScheduler, so this unregisterMergeResult API never gets invoked.
   This is a potential area for future improvement; however, it won't lead to any data duplication issues.
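
   For illustration, a minimal self-contained Scala sketch of that fallback pattern; fetchMergedBlock, fetchOriginalBlock, and fetchWithFallback are hypothetical names for this sketch, not the actual reader code in this PR:

```scala
object MergedFetchFallbackSketch {
  // Hypothetical stand-ins for the real fetch paths; the actual Spark reader
  // plumbing is considerably more involved.
  private def fetchMergedBlock(blockId: String): Iterator[Array[Byte]] =
    throw new java.io.IOException(s"simulated fetch failure for $blockId")

  private def fetchOriginalBlock(blockId: String): Iterator[Array[Byte]] =
    Iterator(Array.emptyByteArray)

  // On a fetch failure for a merged shuffle block, fall back to the unmerged
  // original blocks instead of surfacing the failure. Because the failure
  // never reaches the DAGScheduler, unregisterMergeResult is never invoked.
  def fetchWithFallback(
      mergedBlockId: String,
      originalBlockIds: Seq[String]): Iterator[Array[Byte]] = {
    try {
      fetchMergedBlock(mergedBlockId)
    } catch {
      case _: java.io.IOException =>
        originalBlockIds.iterator.flatMap(fetchOriginalBlock)
    }
  }
}
```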
   Note that once a shuffle merge is finalized, i.e. the driver has already received the MergeStatus for a given shuffle, we won't enable push for subsequent retries of the shuffle map stage sharing the same shuffle dependency (#30164).
   This makes sure that the MergeStatus always tracks the correct information about the merged shuffle partitions.
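
   As a hedged sketch of that guard; markMergeFinalized and shouldEnablePushForRetry are invented names for illustration, not the actual scheduler hooks:

```scala
import scala.collection.mutable

object PushEnablementSketch {
  // Shuffle IDs whose merge has been finalized, i.e. the driver has already
  // received the MergeStatus for the shuffle.
  private val finalizedShuffles = mutable.Set.empty[Int]

  def markMergeFinalized(shuffleId: Int): Unit = synchronized {
    finalizedShuffles += shuffleId
  }

  // A retry of a shuffle map stage sharing an already-finalized shuffle
  // dependency does not re-enable push, so the recorded MergeStatus keeps
  // describing the merged shuffle partitions correctly.
  def shouldEnablePushForRetry(shuffleId: Int): Boolean = synchronized {
    !finalizedShuffles.contains(shuffleId)
  }
}
```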




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


