mridulm commented on a change in pull request #30164:
URL: https://github.com/apache/spark/pull/30164#discussion_r520365496
##########
File path:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -74,6 +74,14 @@ class BlockManagerMasterEndpoint(
// Mapping from block id to the set of block managers that have the block.
private val blockLocations = new JHashMap[BlockId, mutable.HashSet[BlockManagerId]]
+ // Mapping from host name to shuffle (mergers) services where the current app
+ // registered an executor in the past. Older hosts are removed when the
+ // maxRetainedMergerLocations size is reached in favor of newer locations.
+ private val shuffleMergerLocations = new mutable.LinkedHashMap[String, BlockManagerId]()
Review comment:
@Ngone51 That is a good idea. Is there any concern with adding this,
@venkata91?
But I would like to add that it would help in only a subset of cases.
What I mean is, when the fetch of a merged block fails, executors fall back
to fetching the constituent blocks [1]. A fetch failure is reported to the
driver only when both of these fetches fail: the merged block fetch and the
mapper output shuffle block fetch. Because of this fallback, if none of the
individual blocks for a merged block were computed on the lost host in the
parent stage, we won't see a fetch failure.
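To make the subset concrete, here is a minimal sketch of the decision described above. It is a simplified model, not the actual fetch code: `fetchOutcome`, the `Outcome` cases, and the assumption that a fetch fails exactly when its source host is the lost host are all hypothetical and only for illustration.

```scala
object MergedFetchFallbackSketch {
  // Hypothetical outcome type for illustration; not a Spark class.
  sealed trait Outcome
  case object ReadMergedBlock extends Outcome
  case object FellBackToOriginalBlocks extends Outcome
  case object FetchFailureReportedToDriver extends Outcome

  // Simplifying assumption: a fetch fails if and only if its source host is lostHost.
  // mergerHost:  host of the shuffle service holding the merged block.
  // mapperHosts: hosts holding the constituent (mapper output) blocks.
  def fetchOutcome(
      lostHost: String,
      mergerHost: String,
      mapperHosts: Seq[String]): Outcome = {
    if (mergerHost != lostHost) {
      // The merged block is still reachable, so no fallback is needed.
      ReadMergedBlock
    } else if (!mapperHosts.contains(lostHost)) {
      // The merged fetch fails, but every constituent block is elsewhere:
      // the fallback succeeds and the driver never sees a fetch failure.
      FellBackToOriginalBlocks
    } else {
      // Both the merged fetch and at least one constituent fetch fail.
      FetchFailureReportedToDriver
    }
  }
}
```

In this model, `fetchOutcome("h1", "h1", Seq("h2", "h3"))` ends in the fallback branch, so the lost merger on `h1` goes unnoticed by the driver.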
With the recent changes to prefer hosts with active executors, we do improve
the chances of detecting lost merger candidates due to lost hosts though.
[1] Ignoring cases like large blocks which are not candidates for merge, etc.
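
For reference, the retention behavior described in the diff comment (older hosts dropped once the maximum is reached) could be sketched roughly as below. This is an illustration under assumptions, not the code in the PR: `MergerId`, `BoundedMergerLocations`, `maxRetained` and `add` are hypothetical names, and only the use of `mutable.LinkedHashMap` keyed by host name comes from the diff.

```scala
import scala.collection.mutable

object MergerLocationsSketch {
  // Hypothetical stand-in for BlockManagerId, for illustration only.
  final case class MergerId(host: String, port: Int)

  class BoundedMergerLocations(maxRetained: Int) {
    require(maxRetained > 0, "maxRetained must be positive")

    // LinkedHashMap iterates in insertion order, so head is the oldest host.
    private val locations = new mutable.LinkedHashMap[String, MergerId]()

    def add(merger: MergerId): Unit = {
      if (!locations.contains(merger.host)) {
        if (locations.size >= maxRetained) {
          // Evict the oldest host to make room for the newer one.
          locations.remove(locations.head._1)
        }
        locations.put(merger.host, merger)
      }
    }

    // Retained hosts, oldest first.
    def hosts: Seq[String] = locations.keys.toSeq
  }
}
```

Note that a host that is already present keeps its original position in this sketch, so re-registering an executor on it does not refresh its age.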