sumeetgajjar commented on a change in pull request #32114:
URL: https://github.com/apache/spark/pull/32114#discussion_r639053095



##########
File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala
##########
@@ -422,7 +430,7 @@ class BlockManagerMasterEndpoint(
     val locations = blockLocations.get(blockId)
     if (locations != null) {
       locations.foreach { blockManagerId: BlockManagerId =>
-        val blockManager = blockManagerInfo.get(blockManagerId)
+        val blockManager = blockManagerInfo.get(blockManagerId).filter(_.isAlive)

Review comment:
   > What if we maintain the inactive BlockManagerInfos in a separate data structure and remove the BlockManagerInfo from the blockManagerInfo as it is.
   
   Yes @Ngone51, you are right, we discussed this earlier. My original solution
was exactly what you described: a Guava cache that stored the inactive
BlockManagerInfos. With that approach, we also had to pass
`InactiveBlockManagerCache` to `BlockManagerMasterHeartbeatEndpoint` so that it
could respond appropriately to `BlockManagerHeartbeat`.
   
   Later, @attilapiros suggested modeling BlockManager removal as a new state by
adding `removalTS` to `BlockManagerInfo`. I found his solution better than
using the `InactiveBlockManagerCache`.
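
   To make the difference concrete, here is a minimal sketch of the "removal as a state" idea. It is not the code in this PR: the classes are simplified, hypothetical stand-ins for the Spark ones, and `markRemoved` and `aliveInfo` are names invented for the example. Only `removalTS`, `isAlive`, and `blockManagerInfo` come from the discussion and the diff above.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for Spark's BlockManagerId.
final case class BlockManagerId(executorId: String, host: String, port: Int)

// Hypothetical, simplified stand-in for BlockManagerInfo.
final class BlockManagerInfo(val blockManagerId: BlockManagerId) {
  // None while the block manager is alive; Some(timestamp) once it is removed.
  private var removalTS: Option[Long] = None

  def isAlive: Boolean = removalTS.isEmpty

  // Assumed helper: records the removal time once and keeps the first value.
  def markRemoved(nowMs: Long): Unit = {
    if (removalTS.isEmpty) removalTS = Some(nowMs)
  }
}

object RemovalStateSketch {
  // A single map holds both alive and removed entries, so no second
  // structure has to be passed to other endpoints.
  val blockManagerInfo = mutable.HashMap.empty[BlockManagerId, BlockManagerInfo]

  // Mirrors the lookup pattern in the diff: removed entries are filtered out.
  def aliveInfo(id: BlockManagerId): Option[BlockManagerInfo] =
    blockManagerInfo.get(id).filter(_.isAlive)
}
```

   With this shape, a removed block manager stays in `blockManagerInfo`, so code that needs to recognise it can still find the entry, while lookups that only want live block managers add `.filter(_.isAlive)`.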






