mridulm commented on a change in pull request #27313: [SPARK-29148][CORE] Add
stage level scheduling dynamic allocation and scheduler backend changes
URL: https://github.com/apache/spark/pull/27313#discussion_r375106596
##########
File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
##########
@@ -423,21 +510,30 @@ private[spark] class ExecutorAllocationManager(
*/
private def removeExecutors(executors: Seq[String]): Seq[String] =
synchronized {
val executorIdsToBeRemoved = new ArrayBuffer[String]
-
logDebug(s"Request to remove executorIds: ${executors.mkString(", ")}")
- val numExistingExecutors = executorMonitor.executorCount - executorMonitor.pendingRemovalCount
-
- var newExecutorTotal = numExistingExecutors
+ val numExecutorsTotalPerRpId = mutable.Map[Int, Int]()
executors.foreach { executorIdToBeRemoved =>
- if (newExecutorTotal - 1 < minNumExecutors) {
- logDebug(s"Not removing idle executor $executorIdToBeRemoved because there are only " +
- s"$newExecutorTotal executor(s) left (minimum number of executor limit $minNumExecutors)")
- } else if (newExecutorTotal - 1 < numExecutorsTarget) {
- logDebug(s"Not removing idle executor $executorIdToBeRemoved because there are only " +
- s"$newExecutorTotal executor(s) left (number of executor target $numExecutorsTarget)")
+ val rpId = getResourceProfileIdOfExecutor(executorIdToBeRemoved)
+ if (rpId == UNKNOWN_RESOURCE_PROFILE_ID) {
+ logWarning(s"Not removing executor $executorIdsToBeRemoved because couldn't find " +
+ "ResourceProfile for it!")
Review comment:
nit: When testing, it would be nice to assert on this. We should not hit this
situation, right? Or is it possible?
Do we support cleaning up resource profiles?
This brings me to a general question: if all RDDs that reference a resource
profile have been GC'ed, do we also clean up the cluster resources allocated
through that resource profile? (The idle timeout should do this eventually.)
What about references to the profile held within our own data structures, to
prevent leaks? (We can do this in future work, of course, if the intention is
to clean it up. I want to understand whether this is a possibility or whether
the issue is not expected to happen.)
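To make the first question concrete, here is a minimal sketch of the kind of
assertion meant above. It is not the PR's code: `RemoveExecutorsSketch`,
`resourceProfileIdOf`, `partitionRemovable`, and the `-1` sentinel are all
hypothetical stand-ins for `getResourceProfileIdOfExecutor` and
`UNKNOWN_RESOURCE_PROFILE_ID`, and the executor-to-profile map is assumed to
be a plain `Map`.

```scala
object RemoveExecutorsSketch {
  // Hypothetical sentinel standing in for UNKNOWN_RESOURCE_PROFILE_ID.
  val UnknownResourceProfileId: Int = -1

  // Hypothetical stand-in for getResourceProfileIdOfExecutor: look up the
  // profile id, falling back to the sentinel when the executor is untracked.
  def resourceProfileIdOf(executorToRpId: Map[String, Int], executorId: String): Int =
    executorToRpId.getOrElse(executorId, UnknownResourceProfileId)

  // Split the requested executors into those with a known profile (candidates
  // for removal) and those that would hit the warning branch and be skipped.
  def partitionRemovable(
      executors: Seq[String],
      executorToRpId: Map[String, Int]): (Seq[String], Seq[String]) =
    executors.partition { id =>
      resourceProfileIdOf(executorToRpId, id) != UnknownResourceProfileId
    }

  def main(args: Array[String]): Unit = {
    val tracked = Map("1" -> 0, "2" -> 1)
    val (removable, skipped) = partitionRemovable(Seq("1", "2", "3"), tracked)
    // A test could assert that the "unknown profile" branch is never taken
    // for executors the manager itself registered.
    assert(removable == Seq("1", "2"))
    assert(skipped == Seq("3"))
  }
}
```

If the unknown-profile case is truly unreachable, a test asserting that
`skipped` is empty for every tracked executor would document that; if profiles
can be cleaned up later, the same test would show where the lookup starts
returning the sentinel.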