attilapiros commented on a change in pull request #31763:
URL: https://github.com/apache/spark/pull/31763#discussion_r588996218
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -774,6 +783,18 @@ private[spark] class MapOutputTrackerMaster(
}
}
+ def getAllMapOutputStatuses(shuffleId: Int): Array[MapStatus] = {
+ logDebug(s"Fetching all output statuses for shuffle $shuffleId")
+ shuffleStatuses.get(shuffleId) match {
+ case Some(shuffleStatus) =>
+ shuffleStatus.withMapStatuses { statuses =>
+ MapOutputTracker.checkMapStatuses(statuses, shuffleId)
+ statuses.clone
Review comment:
As we discussed in
https://github.com/apache/spark/pull/30004#discussion_r588867352
one solution could be to change this to `getAllMapOutputStatusMetadata` and
return only the metadata, with the added restriction that only immutable
metadata is allowed.
> So to be on the safe side please document we require the metadata to be
> immutable and introduce an `updateMetadata(meta: Option[Serializable])` method in
> `MapStatus`. Then we will be safe and all the use cases are covered.
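
To make the suggestion concrete, here is a rough sketch of the shape I have in mind. It is not the real Spark code: `SimpleMapStatus`, `TrackerSketch` and `Checksum` are hypothetical stand-ins, the `MapStatus` trait is cut down to the two members under discussion, and returning an empty array for an unknown shuffle is only an assumption made for the sketch.

```scala
import java.io.Serializable

// Hypothetical, cut-down stand-in for MapStatus: only the members discussed here.
trait MapStatus {
  // Contract to document: the metadata object itself must be immutable.
  def metadata: Option[Serializable]
  // Metadata is changed by replacing the reference, never by mutating it.
  def updateMetadata(meta: Option[Serializable]): Unit
}

// Hypothetical implementation, only for illustration.
final class SimpleMapStatus(initial: Option[Serializable]) extends MapStatus {
  @volatile private var current: Option[Serializable] = initial
  override def metadata: Option[Serializable] = current
  override def updateMetadata(meta: Option[Serializable]): Unit = { current = meta }
}

// Example of immutable metadata: a case class with only val fields.
final case class Checksum(value: Long)

// Hypothetical stand-in for the tracker: a shuffleId -> statuses lookup.
final class TrackerSketch(shuffleStatuses: Map[Int, Array[MapStatus]]) {
  // Returns only the metadata, so callers never see the MapStatus objects.
  // Because the metadata is immutable, sharing the references is safe.
  def getAllMapOutputStatusMetadata(shuffleId: Int): Array[Option[Serializable]] =
    shuffleStatuses.get(shuffleId) match {
      case Some(statuses) => statuses.map(_.metadata)
      case None => Array.empty[Option[Serializable]] // assumption for this sketch
    }
}
```

With this shape, a caller holding a previously returned array keeps seeing the old metadata after a later `updateMetadata` call, because the update swaps the reference instead of mutating the shared object.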
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.