agrawaldevesh commented on a change in pull request #29422:
URL: https://github.com/apache/spark/pull/29422#discussion_r470889426



##########
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##########
@@ -136,7 +137,9 @@ private[spark] class TaskSchedulerImpl(
   // IDs of the tasks running on each executor
   private val executorIdToRunningTaskIds = new HashMap[String, HashSet[Long]]
 
-  private val executorsPendingDecommission = new HashMap[String, ExecutorDecommissionInfo]
+  val executorsPendingDecommission = new HashMap[String, ExecutorDecommissionInfo]
+  // map of second to list of executors to clear form the above map
+  val decommissioningExecutorsToGc = new util.TreeMap[Long, mutable.ArrayBuffer[String]]()

Review comment:
   Sure. Any structure that lets me GC by time will do. I just wanted 
something lightweight and custom to this use case. 
   
   I expect the TreeMap to hold no more than 60 seconds' worth of entries, 
since entries are keyed by the second and are cleaned up on every check. The 
check runs on every executor loss and fetch failure. But yes, if there are no 
failures, the entries could just sit there :-P. 
   
   I will change it to a Cache. Good idea. 
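
   For reference, the TreeMap approach described above can be sketched roughly 
as follows. This is a minimal standalone illustration, not the code in the PR: 
`DecomInfo`, `DecommissionTracker`, `record`, `gc`, and `ttlSeconds` are 
hypothetical names, and `DecomInfo` merely stands in for 
`ExecutorDecommissionInfo`. 

```scala
import java.util.{TreeMap => JTreeMap}
import scala.collection.mutable

// Hypothetical stand-in for ExecutorDecommissionInfo.
final case class DecomInfo(message: String)

// Sketch only: remember decommissioned executors and forget them
// once ttlSeconds have passed (assumed retention window).
class DecommissionTracker(ttlSeconds: Long) {
  val pending = new mutable.HashMap[String, DecomInfo]
  // Expiry second -> executors to clear from `pending` at that second.
  val toGc = new JTreeMap[Long, mutable.ArrayBuffer[String]]()

  def record(executorId: String, info: DecomInfo, nowSeconds: Long): Unit = {
    pending(executorId) = info
    val expiry = nowSeconds + ttlSeconds
    var bucket = toGc.get(expiry)
    if (bucket == null) {
      bucket = mutable.ArrayBuffer.empty[String]
      toGc.put(expiry, bucket)
    }
    bucket += executorId
  }

  // Called on every check (e.g. executor loss or fetch failure).
  def gc(nowSeconds: Long): Unit = {
    // headMap returns a live view of all buckets with expiry <= nowSeconds.
    val expired = toGc.headMap(nowSeconds, true)
    expired.values().forEach(ids => ids.foreach(id => pending.remove(id)))
    // Clearing the view removes those buckets from the backing TreeMap.
    expired.clear()
  }
}
```

   Because cleanup only runs inside `gc`, entries linger if nothing ever 
triggers a check, which is the drawback noted above and the reason a cache with 
time-based expiry is the simpler choice. 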




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


