attilapiros commented on a change in pull request #30004:
URL: https://github.com/apache/spark/pull/30004#discussion_r588826812
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -827,6 +848,13 @@ private[spark] class MapOutputTrackerWorker(conf: SparkConf) extends MapOutputTr
}
}
+ override def getAllMapOutputStatuses(shuffleId: Int): Array[MapStatus] = {
+ logDebug(s"Fetching all output statuses for shuffle $shuffleId")
+ val statuses = getStatuses(shuffleId, conf)
Review comment:
Please clear the `mapStatuses` in case of `MetadataFetchFailedException`!
Reasoning:
Before this PR, the `getStatuses` method was used only in
`getMapSizesByExecutorId`, where a `MetadataFetchFailedException` (thrown
when a missing output location is detected) is handled by clearing the
`mapStatuses` cache, as it is probably outdated.
~~I am sure that clearing would not be missed if this cleaning would be done
at the throwing of that exception.~~
~~Could you please check whether it can be moved there?~~
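A rough sketch of the handling I have in mind, mirroring what `getMapSizesByExecutorId` does. This is not the real Spark code: `TrackerSketch`, the `fetch` parameter, `validate`, and the simplified `MapStatus` and `MetadataFetchFailedException` classes are hypothetical stand-ins so the snippet compiles on its own, and the exact place where the exception is thrown in the PR may differ.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-ins for the Spark types, only so the sketch is self-contained.
final case class MapStatus(location: String)
final class MetadataFetchFailedException(shuffleId: Int, message: String)
  extends Exception(s"shuffle $shuffleId: $message")

// `fetch` stands in for asking the driver for the statuses of a shuffle.
class TrackerSketch(fetch: Int => Array[MapStatus]) {
  // Worker-side cache, playing the role of `mapStatuses`.
  val mapStatuses = TrieMap.empty[Int, Array[MapStatus]]

  // Stand-in for `getStatuses`: serve from the cache, fetching on a miss.
  private def getStatuses(shuffleId: Int): Array[MapStatus] =
    mapStatuses.getOrElseUpdate(shuffleId, fetch(shuffleId))

  // Assumed validation step: a null entry means an output location is missing.
  private def validate(shuffleId: Int, statuses: Array[MapStatus]): Unit =
    if (statuses.exists(_ == null)) {
      throw new MetadataFetchFailedException(shuffleId, "Missing an output location")
    }

  def getAllMapOutputStatuses(shuffleId: Int): Array[MapStatus] = {
    val statuses = getStatuses(shuffleId)
    try {
      validate(shuffleId, statuses)
      statuses
    } catch {
      case e: MetadataFetchFailedException =>
        // The cache is probably outdated, so clear it before rethrowing.
        mapStatuses.clear()
        throw e
    }
  }
}
```

With this shape, a failed lookup leaves the cache empty, so the next call fetches fresh statuses instead of reusing the stale ones.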