attilapiros commented on a change in pull request #32114:
URL: https://github.com/apache/spark/pull/32114#discussion_r612373538
##########
File path: core/src/main/scala/org/apache/spark/SparkEnv.scala
##########
@@ -355,6 +355,12 @@ object SparkEnv extends Logging {
// Mapping from block manager id to the block manager's information.
val blockManagerInfo = new concurrent.TrieMap[BlockManagerId, BlockManagerInfo]()
+ // Using a cache here since we only want to track recently removed executors to deny their
+ // block manager registration while their StopExecutor message is in-flight.
+ // Assuming average size of 6 bytes of execId and each entry in Cache taking around 64 bytes,
+ // max size of this cache = (6 + 64) * 30000 = 2.1MB
+ val recentlyRemovedExecutors = CacheBuilder.newBuilder().maximumSize(30000)
+   .build[String, String]()
Review comment:
I think another possible solution is to extend `BlockManagerInfo` with the
timestamp of the removal. By modelling the removal as a new state, we could
avoid this separate cache completely and keep all the block-manager-related
data in the same place.
Of course, in this case you would have to implement the cleanup as well.
For example, it could be just a simple Long var which is 0 by default, meaning
the BlockManager is alive/active (this special value can be hidden behind a
method of `BlockManagerInfo` like `isAlive(currentTs)`). The cleanup would be
triggered after the delay plus some extra time, to avoid iterating over the
`blockManagerInfo` collection too frequently.
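To make the idea concrete, here is a rough, self-contained sketch. All names
here (`BlockManagerInfoSketch`, `removedAtMs`, `markRemoved`, `removedRecently`,
`cleanupRemoved`) are purely illustrative and not the actual Spark API, and the
map is keyed by a plain execId string just for brevity:

```scala
import scala.collection.concurrent.TrieMap

// Sketch of folding the removal timestamp into the per-block-manager info
// instead of keeping a separate Guava cache of removed executors.
class BlockManagerInfoSketch(val execId: String) {
  // 0 means the block manager is alive/active; otherwise this is the
  // timestamp (ms) at which the block manager was removed.
  @volatile var removedAtMs: Long = 0L

  def markRemoved(nowMs: Long): Unit = { removedAtMs = nowMs }

  def isAlive: Boolean = removedAtMs == 0L
}

object CleanupSketch {
  val blockManagerInfo = new TrieMap[String, BlockManagerInfoSketch]()

  // Would be consulted on (re-)registration: deny it if this executor's block
  // manager was removed less than `denyWindowMs` ago.
  def removedRecently(execId: String, nowMs: Long, denyWindowMs: Long): Boolean =
    blockManagerInfo.get(execId) match {
      case Some(info) => !info.isAlive && nowMs - info.removedAtMs < denyWindowMs
      case None => false
    }

  // Periodic cleanup, run after the delay plus some extra slack so that we do
  // not iterate over the collection too often. TrieMap iteration is
  // snapshot-based, so removing entries while iterating is safe.
  def cleanupRemoved(nowMs: Long, expiryMs: Long): Unit = {
    blockManagerInfo.foreach { case (id, info) =>
      if (!info.isAlive && nowMs - info.removedAtMs >= expiryMs) {
        blockManagerInfo.remove(id)
      }
    }
  }
}
```

The registration path would then check something like `removedRecently` instead
of the separate cache, and the cleanup would run periodically with some slack on
top of the deny window.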
WDYT?