venkata91 commented on a change in pull request #30164:
URL: https://github.com/apache/spark/pull/30164#discussion_r516894315
##########
File path:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -657,6 +679,14 @@ class BlockManagerMasterEndpoint(
}
}
+ private def getMergerLocations(
+ numMergersNeeded: Int,
+ hostsToFilter: Set[String]): Seq[BlockManagerId] = {
+ // Copying the merger locations to a list so that the original mergerLocations won't be shuffled
+ val mergers = mergerLocations.values.filterNot(x => hostsToFilter.contains(x.host)).toSeq
+ Utils.randomize(mergers).take(numMergersNeeded)
Review comment:
Hm. We haven't really run many experiments on this yet, but it is an
interesting area to explore further. Another option is to pass the merger
locations to `ExecutorAllocationManager` as part of `ShuffleMapStage` creation,
so that it can give these executors some preference when we remove executors
between stages.
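
To make the idea concrete, here is a rough sketch of what "preference" could mean. It assumes preference means releasing executors on merger hosts last. `IdleExecutor`, `MergerAwareRemoval`, and `chooseExecutorsToRemove` are hypothetical names used only for illustration; they are not part of Spark or of `ExecutorAllocationManager`, and this is not how removal is currently decided.

```scala
// Hypothetical stand-in for an idle executor; not a Spark class.
final case class IdleExecutor(id: String, host: String)

object MergerAwareRemoval {
  /**
   * Picks up to `numToRemove` idle executors to release.
   * Executors on hosts that are NOT merger locations are chosen first,
   * so executors on merger hosts are only released when nothing else is left.
   * The relative order within each group is preserved.
   */
  def chooseExecutorsToRemove(
      idle: Seq[IdleExecutor],
      mergerHosts: Set[String],
      numToRemove: Int): Seq[IdleExecutor] = {
    // Split idle executors by whether their host is a merger location.
    val (onMergerHosts, others) = idle.partition(e => mergerHosts.contains(e.host))
    // Non-merger executors come first in the removal order.
    (others ++ onMergerHosts).take(numToRemove)
  }
}
```

With idle executors on `hostA`, `hostB`, and `hostC`, and `hostA` as the only merger host, asking for two removals would pick the executors on `hostB` and `hostC` and leave the one on `hostA` alone. Whether a strict ordering like this is better than a softer weighting is exactly the kind of thing that would need experiments.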