attilapiros commented on a change in pull request #30004:
URL: https://github.com/apache/spark/pull/30004#discussion_r588867352
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -774,6 +783,18 @@ private[spark] class MapOutputTrackerMaster(
     }
   }
 
+  def getAllMapOutputStatuses(shuffleId: Int): Array[MapStatus] = {
+    logDebug(s"Fetching all output statuses for shuffle $shuffleId")
+    shuffleStatuses.get(shuffleId) match {
+      case Some(shuffleStatus) =>
+        shuffleStatus.withMapStatuses { statuses =>
+          MapOutputTracker.checkMapStatuses(statuses, shuffleId)
+          statuses.clone
Review comment:
Cloning the array might not be enough if the metadata can be mutated: `clone`
only makes a shallow copy, so the metadata objects are still shared. But as far
as I can see, we could easily solve all of these problems with immutable
metadata. So, to be on the safe side, please document that we require the
metadata to be immutable, and introduce an
`updateMetadata(meta: Option[Serializable])` method in `MapStatus`. Then we
will be safe and all the use cases are covered.
(And you can use a case class for the Uber RSS's `MapTaskRssInfo`.)
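
To illustrate the suggestion, here is a minimal sketch rather than the actual Spark code. `SketchMapStatus` is a hypothetical stand-in for `MapStatus`, and the `host` and `port` fields of `MapTaskRssInfo` are assumed purely for the example. It shows that a cloned array still shares its elements, and that an immutable metadata value read before an update stays consistent, because `updateMetadata` replaces the reference instead of mutating the object.

```scala
// Hypothetical stand-in types; NOT Spark's real MapStatus API.
// An immutable case class: all fields are vals, so an instance never changes.
final case class MapTaskRssInfo(host: String, port: Int)

// Assumed shape of a status carrying optional, immutable metadata.
final class SketchMapStatus(initial: Option[Serializable]) {
  @volatile private var meta: Option[Serializable] = initial

  def metadata: Option[Serializable] = meta

  // Replaces the metadata reference; the old value is never mutated in place.
  def updateMetadata(newMeta: Option[Serializable]): Unit = {
    meta = newMeta
  }
}

object ImmutableMetadataSketch {
  def main(args: Array[String]): Unit = {
    val statuses = Array(new SketchMapStatus(Some(MapTaskRssInfo("host-a", 9000))))

    // clone() is shallow: the new array holds the same SketchMapStatus objects.
    val snapshot = statuses.clone()
    assert(snapshot(0) eq statuses(0))

    // A reader captures the metadata value before a writer updates it.
    val seen = snapshot(0).metadata
    statuses(0).updateMetadata(Some(MapTaskRssInfo("host-b", 9001)))

    // The captured value is unchanged because the metadata itself is immutable.
    assert(seen == Some(MapTaskRssInfo("host-a", 9000)))
    // The shared status object now exposes the replacement value.
    assert(snapshot(0).metadata == Some(MapTaskRssInfo("host-b", 9001)))
  }
}
```

With mutable metadata, the captured value could change underneath the reader, which is the case the cloned array alone does not protect against.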
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.