venkata91 commented on a change in pull request #34122:
URL: https://github.com/apache/spark/pull/34122#discussion_r788286262
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -1176,6 +1223,9 @@ private[spark] class MapOutputTrackerWorker(conf: SparkConf) extends MapOutputTr
   // instantiate a serializer. See the followup to SPARK-36705 for more details.
   private lazy val fetchMergeResult = Utils.isPushBasedShuffleEnabled(conf, isDriver = false)
+  // Exposed for testing
+  val shufflePushMergerLocations = new ConcurrentHashMap[Int, Seq[BlockManagerId]]().asScala
Review comment:
Instead of keeping track of `<shuffleId, shuffleMergeId>` pairs for shuffle merger locations, I think we can piggyback on `unregisterAllMapAndMergeOutput` to clear the `mergers` in `ShuffleStatus`; similarly, in `updateEpoch` (as in this [comment](https://github.com/apache/spark/pull/34122#discussion_r785388799)) we can clear the `shufflePushMergerLocations`. This makes the code simpler and neater. Let me know if you think otherwise or if this has any other issues.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]