sumeetgajjar commented on a change in pull request #32114:
URL: https://github.com/apache/spark/pull/32114#discussion_r639053095
##########
File path:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -422,7 +430,7 @@ class BlockManagerMasterEndpoint(
val locations = blockLocations.get(blockId)
if (locations != null) {
locations.foreach { blockManagerId: BlockManagerId =>
- val blockManager = blockManagerInfo.get(blockManagerId)
+ val blockManager = blockManagerInfo.get(blockManagerId).filter(_.isAlive)
Review comment:
> What if we maintain the inactive BlockManagerInfos in a separate data
> structure and remove the BlockManagerInfo from the blockManagerInfo as it is.
Yes @Ngone51, you are right, we discussed this earlier. My original solution
was exactly what you described: a Guava cache that stored the inactive
BlockManagerInfos. With that approach, we also had to pass the
`InactiveBlockManagerCache` to `BlockManagerMasterHeartbeatEndpoint` so that it
could respond appropriately to `BlockManagerHeartbeat`.
Later, @attilapiros suggested modeling BlockManager removal as a new state by
adding `removalTS` to `BlockManagerInfo`. I found his solution better than
using the `InactiveBlockManagerCache`.
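
To illustrate the difference, here is a minimal sketch of the state-based idea. It is not the code in this PR: `BlockManagerId` and `BlockManagerInfo` below are simplified stand-ins for the Spark classes, and `RemovalStateSketch`, `markRemoved`, `removedBefore`, `aliveInfo`, `isKnown` and `purge` are hypothetical names chosen for the example. The point is that a removed entry stays in the same map with a removal timestamp, readers that need a live block manager filter on `isAlive`, and no second data structure has to be passed around.

```scala
import scala.collection.mutable

// Simplified stand-ins for the Spark classes (assumption, not the real definitions).
final case class BlockManagerId(executorId: String, host: String, port: Int)

final class BlockManagerInfo(val id: BlockManagerId) {
  // None while the block manager is alive; Some(timestamp) once it has been removed.
  private var removalTs: Option[Long] = None

  def isAlive: Boolean = removalTs.isEmpty

  // Record the removal time once; repeated calls keep the first timestamp.
  def markRemoved(nowMs: Long): Unit = {
    if (removalTs.isEmpty) removalTs = Some(nowMs)
  }

  // True only for entries that were removed strictly before the cutoff.
  def removedBefore(cutoffMs: Long): Boolean = removalTs.exists(_ < cutoffMs)
}

object RemovalStateSketch {
  private val blockManagerInfo =
    mutable.HashMap.empty[BlockManagerId, BlockManagerInfo]

  def register(id: BlockManagerId): Unit =
    blockManagerInfo.put(id, new BlockManagerInfo(id))

  // Removal becomes a state change instead of deleting the map entry.
  def remove(id: BlockManagerId, nowMs: Long): Unit =
    blockManagerInfo.get(id).foreach(_.markRemoved(nowMs))

  // Callers that need a live block manager use the same filter as in the diff.
  def aliveInfo(id: BlockManagerId): Option[BlockManagerInfo] =
    blockManagerInfo.get(id).filter(_.isAlive)

  // A removed block manager is still known until it is purged.
  def isKnown(id: BlockManagerId): Boolean = blockManagerInfo.contains(id)

  // Drop entries whose removal timestamp is older than the cutoff.
  def purge(cutoffMs: Long): Unit = {
    val expired = blockManagerInfo.collect {
      case (id, info) if info.removedBefore(cutoffMs) => id
    }
    blockManagerInfo --= expired
  }
}
```

In this sketch, a heartbeat handler could call `isKnown` to recognize a recently removed block manager, while block location lookups would go through `aliveInfo`. How long removed entries are retained before `purge` runs is left open here.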