mridulm commented on a change in pull request #27313: [SPARK-29148][CORE] Add 
stage level scheduling dynamic allocation and scheduler backend changes
URL: https://github.com/apache/spark/pull/27313#discussion_r375106596
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##########
 @@ -423,21 +510,30 @@ private[spark] class ExecutorAllocationManager(
    */
   private def removeExecutors(executors: Seq[String]): Seq[String] = 
synchronized {
     val executorIdsToBeRemoved = new ArrayBuffer[String]
-
     logDebug(s"Request to remove executorIds: ${executors.mkString(", ")}")
-    val numExistingExecutors = executorMonitor.executorCount - 
executorMonitor.pendingRemovalCount
-
-    var newExecutorTotal = numExistingExecutors
+    val numExecutorsTotalPerRpId = mutable.Map[Int, Int]()
     executors.foreach { executorIdToBeRemoved =>
-      if (newExecutorTotal - 1 < minNumExecutors) {
-        logDebug(s"Not removing idle executor $executorIdToBeRemoved because 
there are only " +
-          s"$newExecutorTotal executor(s) left (minimum number of executor 
limit $minNumExecutors)")
-      } else if (newExecutorTotal - 1 < numExecutorsTarget) {
-        logDebug(s"Not removing idle executor $executorIdToBeRemoved because 
there are only " +
-          s"$newExecutorTotal executor(s) left (number of executor target 
$numExecutorsTarget)")
+      val rpId = getResourceProfileIdOfExecutor(executorIdToBeRemoved)
+      if (rpId == UNKNOWN_RESOURCE_PROFILE_ID) {
+        logWarning(s"Not removing executor $executorIdsToBeRemoved because 
couldn't find " +
+          "ResourceProfile for it!")
 
 Review comment:
   nit: When testing, would be nice to assert this; we should not have this 
situation right ? Or it is possible ?
   Do we support cleaning up of resource profiles ? 
   
   This actually brings me to a general question - if all rdd's which are 
referencing a resource profile have  been gc'ed, do we also cleanup the cluster 
resources allocated through that resource profile ? (idle timeout should do 
this eventually).
   What about references within our data structures for the profile to prevent 
leaks ? (We can do this in a  future work ofcourse if the intention is to clean 
it - I want to  understand if it is a possibility or whether the issue is not 
expected to happen).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to