sumeetgajjar commented on a change in pull request #32114:
URL: https://github.com/apache/spark/pull/32114#discussion_r612814973



##########
File path: core/src/main/scala/org/apache/spark/SparkEnv.scala
##########
@@ -355,6 +355,12 @@ object SparkEnv extends Logging {
 
     // Mapping from block manager id to the block manager's information.
     val blockManagerInfo = new concurrent.TrieMap[BlockManagerId, BlockManagerInfo]()
+    // Using a cache here since we only want to track recently removed executors to deny their
+    // block manager registration while their StopExecutor message is in-flight.
+    // Assuming average size of 6 bytes of execId and each entry in Cache taking around 64 bytes,
+    // max size of this cache = (6 + 64) * 30000 = 2.1MB
+    val recentlyRemovedExecutors = CacheBuilder.newBuilder().maximumSize(30000)
+      .build[String, String]()

Review comment:
       I believe extending `BlockManagerInfo` should solve the problem.
   However, we will have to abstract `blockManagerInfo: mutable.Map[BlockManagerId, BlockManagerInfo]`.
   Currently, on `RemoveExecutor`, we remove the corresponding `BlockManagerInfo` from the `blockManagerInfo` map.
   With this change, we would instead keep the entry and set `removalTs` inside `BlockManagerInfo`, so every other method that reads the map would first have to filter out the removed entries before using the values.
   
   So my suggestion is to move these details into a `BlockManagerEndpointSharedState`, which holds the `blockManagerInfo` map and exposes methods for lookups and updates.
   I propose the name `BlockManagerEndpointSharedState` because the same object would have to be passed to both `BlockManagerMasterEndpoint` and `BlockManagerMasterHeartbeatEndpoint`.
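
   To make the proposal concrete, here is a rough sketch of what such a wrapper might look like. It is only an illustration, not code from this PR: `BlockManagerId` and `BlockManagerInfo` below are simplified stand-ins for the real Spark classes, and every method name (`register`, `markRemoved`, `getLive`, `liveIds`, `isRecentlyRemoved`, `purgeRemovedBefore`) is an assumption. It also does not try to make compound check-then-update sequences atomic.

```scala
import scala.collection.concurrent.TrieMap

// Simplified stand-ins for Spark's classes; the fields are assumptions.
final case class BlockManagerId(executorId: String, host: String, port: Int)

final class BlockManagerInfo(val id: BlockManagerId) {
  // None while the block manager is live; Some(timestamp) once it is removed.
  @volatile var removalTs: Option[Long] = None
  def isLive: Boolean = removalTs.isEmpty
}

// Hypothetical holder for the map, so callers never filter it by hand.
final class BlockManagerEndpointSharedState {
  private val blockManagerInfo = TrieMap.empty[BlockManagerId, BlockManagerInfo]

  // Adds (or replaces) the entry for a block manager as a live one.
  def register(id: BlockManagerId): Unit =
    blockManagerInfo.put(id, new BlockManagerInfo(id))

  // Instead of dropping the entry, record when it was removed.
  def markRemoved(id: BlockManagerId, nowMs: Long): Unit =
    blockManagerInfo.get(id).foreach(info => info.removalTs = Some(nowMs))

  // Lookup that hides entries already marked as removed.
  def getLive(id: BlockManagerId): Option[BlockManagerInfo] =
    blockManagerInfo.get(id).filter(_.isLive)

  // Ids of all block managers that have not been marked as removed.
  def liveIds: Set[BlockManagerId] =
    blockManagerInfo.iterator.collect { case (id, info) if info.isLive => id }.toSet

  // True if the entry is still tracked but has been marked as removed.
  def isRecentlyRemoved(id: BlockManagerId): Boolean =
    blockManagerInfo.get(id).exists(!_.isLive)

  // Drops entries whose removal timestamp is older than the cutoff.
  def purgeRemovedBefore(cutoffMs: Long): Unit =
    blockManagerInfo.foreach { case (id, info) =>
      if (info.removalTs.exists(_ < cutoffMs)) blockManagerInfo.remove(id)
    }
}
```

   With something along these lines, a single instance could be created once and handed to both endpoints, and a registration handler could consult `isRecentlyRemoved` before accepting a block manager.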






