mridulm commented on a change in pull request #32730:
URL: https://github.com/apache/spark/pull/32730#discussion_r646055096
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -154,14 +159,26 @@ private class ShuffleStatus(
*/
def updateMapOutput(mapId: Long, bmAddress: BlockManagerId): Unit =
withWriteLock {
try {
- val mapStatusOpt = mapStatuses.find(_.mapId == mapId)
+ val mapStatusOpt = mapStatuses.find(x => x != null && x.mapId == mapId)
mapStatusOpt match {
case Some(mapStatus) =>
logInfo(s"Updating map output for ${mapId} to ${bmAddress}")
mapStatus.updateLocation(bmAddress)
invalidateSerializedMapOutputStatusCache()
case None =>
- logWarning(s"Asked to update map output ${mapId} for untracked map status.")
+ val index = mapStatusesDeleted.indexWhere(x => x != null && x.mapId == mapId)
+ if (index >= 0) {
+ val mapStatus = mapStatusesDeleted(index)
+ mapStatus.updateLocation(bmAddress)
+ assert(mapStatuses(index) == null)
Review comment:
I am fine with the PR; I just have one question.
I have not looked into decommissioning in detail, but will this assumption
always hold?
Can an interleaved recomputation or speculative task update the
MapOutputTracker (MOT) entry for that index (not mapId)?
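
To make the interleaving concrete, here is a minimal toy model of the scenario in question. It is a sketch under assumptions, not Spark's actual ShuffleStatus: the names ToyShuffleStatus, Status, UpdateResult and InterleavingSketch are hypothetical, and it assumes that the update moves the entry from the deleted array back into the live array, which the visible part of the diff does not show. The assert is replaced by an explicit check so the outcome can be observed instead of throwing.

```scala
// Toy model only: every class and method name here is hypothetical, not Spark's.
final class Status(val mapId: Long, var location: String)

sealed trait UpdateResult
case object Updated extends UpdateResult   // live entry found and relocated
case object Restored extends UpdateResult  // deleted entry relocated, slot was still empty
case object SlotTaken extends UpdateResult // deleted entry found, but slot was refilled
case object Untracked extends UpdateResult // mapId not known at all

final class ToyShuffleStatus(numMaps: Int) {
  private val mapStatuses = new Array[Status](numMaps)
  private val mapStatusesDeleted = new Array[Status](numMaps)

  // Register (or re-register) the output for a map index.
  def addMapOutput(index: Int, status: Status): Unit = synchronized {
    mapStatuses(index) = status
  }

  // Move the live entry to the deleted array, leaving the live slot empty.
  def removeMapOutput(index: Int): Unit = synchronized {
    mapStatusesDeleted(index) = mapStatuses(index)
    mapStatuses(index) = null
  }

  def updateMapOutput(mapId: Long, newLocation: String): UpdateResult = synchronized {
    mapStatuses.find(s => s != null && s.mapId == mapId) match {
      case Some(s) =>
        s.location = newLocation
        Updated
      case None =>
        val index = mapStatusesDeleted.indexWhere(s => s != null && s.mapId == mapId)
        if (index < 0) {
          Untracked
        } else if (mapStatuses(index) != null) {
          // This is the case where assert(mapStatuses(index) == null) would fail.
          SlotTaken
        } else {
          // Assumption: the relocated entry becomes the live entry again.
          val s = mapStatusesDeleted(index)
          s.location = newLocation
          mapStatuses(index) = s
          mapStatusesDeleted(index) = null
          Restored
        }
    }
  }
}

object InterleavingSketch {
  // Returns the outcome without and with an interleaved re-registration.
  def run(): (UpdateResult, UpdateResult) = {
    val quiet = new ToyShuffleStatus(1)
    quiet.addMapOutput(0, new Status(10L, "exec-1"))
    quiet.removeMapOutput(0)
    val noInterleaving = quiet.updateMapOutput(10L, "exec-2")

    val busy = new ToyShuffleStatus(1)
    busy.addMapOutput(0, new Status(10L, "exec-1"))
    busy.removeMapOutput(0)
    // A recomputed or speculative attempt fills the same index with a new mapId.
    busy.addMapOutput(0, new Status(11L, "exec-3"))
    val withInterleaving = busy.updateMapOutput(10L, "exec-2")

    (noInterleaving, withInterleaving)
  }
}
```

In this model the assumption holds only if nothing re-registers an output at that index between the removal and the location update. Whether the real code paths allow that interleaving is exactly the open question.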