attilapiros commented on a change in pull request #30004:
URL: https://github.com/apache/spark/pull/30004#discussion_r588826812



##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -827,6 +848,13 @@ private[spark] class MapOutputTrackerWorker(conf: SparkConf) extends MapOutputTr
     }
   }
 
+  override def getAllMapOutputStatuses(shuffleId: Int): Array[MapStatus] = {
+    logDebug(s"Fetching all output statuses for shuffle $shuffleId")
+    val statuses = getStatuses(shuffleId, conf)

Review comment:
   Please clean up the `mapStatuses` in case of `MetadataFetchFailedException`!
   
   Reasoning:
   Before this PR, the `getStatuses` method was only used in
   `getMapSizesByExecutorId`, where the `MetadataFetchFailedException` (thrown
   when a missing output location is detected) is handled by clearing the
   `mapStatuses` cache, as the cache is probably outdated.
   
   
   I am sure the cleanup could not be forgotten if it were done at the point
   where that exception is thrown.
   Could you please check whether it can be moved there?
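
A rough sketch of the idea, using simplified stand-in types rather than the real Spark classes. `SketchTracker`, the `String` locations, the `fetch` function, and the `validated` helper are all hypothetical names chosen for illustration; only `mapStatuses`, `getStatuses`, `getAllMapOutputStatuses`, `getMapSizesByExecutorId`, and `MetadataFetchFailedException` come from the discussion above, and their shapes here are assumptions.

```scala
import scala.collection.concurrent.TrieMap

// Stand-in for Spark's exception; simplified and hypothetical.
final class MetadataFetchFailedException(shuffleId: Int, message: String)
  extends RuntimeException(s"shuffle $shuffleId: $message")

// Hypothetical, simplified worker-side tracker; not the real MapOutputTrackerWorker.
class SketchTracker(fetch: Int => Array[String]) {
  // Cache of shuffleId -> per-map output locations (null marks a missing output).
  val mapStatuses = TrieMap.empty[Int, Array[String]]

  // The single place that detects a missing output, clears the cache, and throws.
  private def validated(shuffleId: Int, statuses: Array[String]): Array[String] = {
    if (statuses.contains(null)) {
      mapStatuses.clear() // the cache is probably outdated
      throw new MetadataFetchFailedException(shuffleId, "Missing an output location")
    }
    statuses
  }

  // Returns the cached statuses, fetching them on a cache miss.
  private def getStatuses(shuffleId: Int): Array[String] =
    mapStatuses.getOrElseUpdate(shuffleId, fetch(shuffleId))

  // Both callers go through `validated`, so neither can skip the cleanup.
  def getAllMapOutputStatuses(shuffleId: Int): Array[String] =
    validated(shuffleId, getStatuses(shuffleId))

  def getMapSizesByExecutorId(shuffleId: Int): Seq[String] =
    validated(shuffleId, getStatuses(shuffleId)).toSeq
}
```

With the cleanup tied to the throw site, a new caller of `getStatuses` gets the cache invalidation without needing its own `try`/`catch`.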
   
   



