mridulm commented on a change in pull request #34122:
URL: https://github.com/apache/spark/pull/34122#discussion_r785388090



##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -1176,6 +1223,9 @@ private[spark] class MapOutputTrackerWorker(conf: SparkConf) extends MapOutputTr
   // instantiate a serializer. See the followup to SPARK-36705 for more details.
   private lazy val fetchMergeResult = Utils.isPushBasedShuffleEnabled(conf, isDriver = false)
 
+  // Exposed for testing
+  val shufflePushMergerLocations = new ConcurrentHashMap[Int, 
Seq[BlockManagerId]]().asScala

Review comment:
       This needs to account for the shuffle merge id as well, to ensure the merger locations stay consistent across stage attempts.
   A possible flow is:
   a) Attempt 0 has enough mergers, so executors cache the merger locations for attempt 0.
   b) One or more hosts get added to the exclude list before attempt 1.
   c) Attempt 1 starts with merge allowed = true but merge enabled = false, so tasks start without mergers populated in the dependency (merge might get enabled as more executors are added).
   d) Executors which ran attempt 0 will use the mergers from the previous attempt, while new executors will use the mergers from attempt 1.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]