Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/20604#discussion_r185005638
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1643,7 +1646,10 @@ class SparkContext(config: SparkConf) extends
Logging {
def killExecutors(executorIds: Seq[String]): Boolean = {
schedulerBackend match {
case b: ExecutorAllocationClient =>
- b.killExecutors(executorIds, replace = false, force =
true).nonEmpty
+ require(executorAllocationManager.isEmpty,
--- End diff ---
My point in general is that the semantics of combining
`SparkContext.killExecutors()` (which is a publicly visible function, which the
end user can call) with dynamic allocation aren't well defined, and I have no
idea what the behavior really should be. I was giving some examples of weird
behavior.
>> If you've got just one executor, and then you kill it, should your app
sit with 0 executors?

> if the app sits with 0 executors, then pending tasks increase, which leads the
ExecutorAllocationManager to increase the target number of executors. So the app
will not always sit with 0 executors.
That's true -- but only *when* pending tasks increase. But if you've got 0
executors, how do you expect pending tasks to increase? That would only happen
when another taskset gets submitted, but with no executors your Spark program
will probably just be blocked.
In the other case, I'm just trying to point out strange interactions
between user control and dynamic allocation control. Imagine this sequence:

* Dynamic Allocation: 1000 tasks, so 1000 executors
* User: I only want 10 executors, so let me tell Spark to kill 990 of them
* ... another taskset is submitted to add 1 more task ...
* Dynamic Allocation: 1001 tasks, so 1001 executors
* User: ??? I set the target to 10 executors, what happened?
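That sequence can be sketched as a toy model (this is purely illustrative -- `AllocationModel` is an invented class, not Spark's actual code; the real `ExecutorAllocationManager` logic is more involved, but the recomputed-from-pending-tasks behavior is the point):

```scala
// Toy model of the interaction above: dynamic allocation recomputes the
// target from pending tasks, silently discarding a user-set target.
class AllocationModel {
  private var target = 0

  // Dynamic allocation (simplified): one executor per pending task.
  def onPendingTasks(numTasks: Int): Int = {
    target = numTasks
    target
  }

  // User calls sc.killExecutors(): target drops to whatever survives.
  def userKill(numKilled: Int): Int = {
    target -= numKilled
    target
  }

  def currentTarget: Int = target
}

object KillTargetDemo extends App {
  val m = new AllocationModel
  m.onPendingTasks(1000) // dynamic allocation: 1000 tasks -> target 1000
  m.userKill(990)        // user: kill 990, expecting a stable target of 10
  assert(m.currentTarget == 10)
  m.onPendingTasks(1001) // one more task arrives; target is recomputed
  assert(m.currentTarget == 1001) // the user's "10" is silently gone
  println(m.currentTarget)
}
```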
> So, if we are not using ExecutorAllocationManager, the allocation client
will request requestedTotalExecutors = 0 executors from the cluster manager
(this is really terrible)
hmm, from a quick look, I think you're right. It seems that using
`sc.killExecutors()` doesn't make sense even with dynamic allocation off. I
think `CoarseGrainedSchedulerBackend` should actually initialize
`requestedTotalExecutors` with
`SchedulerBackendUtils.getInitialTargetExecutorNumber`
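Roughly, the suggested fix would look something like this sketch (simplified and not Spark's actual code -- the real `getInitialTargetExecutorNumber` takes a `SparkConf` and combines the min/initial/instances settings; here a plain `Map` stands in for the conf, and only the lookups used below are modeled):

```scala
// Sketch: derive the initial requestedTotalExecutors from configuration
// instead of letting it default to 0.
object InitialTargetSketch {
  val DefaultNumberOfExecutors = 2

  // `conf` stands in for SparkConf (illustrative only).
  def getInitialTargetExecutorNumber(conf: Map[String, String]): Int =
    if (conf.get("spark.dynamicAllocation.enabled").contains("true")) {
      conf.get("spark.dynamicAllocation.initialExecutors").map(_.toInt)
        .orElse(conf.get("spark.dynamicAllocation.minExecutors").map(_.toInt))
        .getOrElse(0)
    } else {
      conf.get("spark.executor.instances").map(_.toInt)
        .getOrElse(DefaultNumberOfExecutors)
    }
}

object InitDemo extends App {
  // With dynamic allocation off and no explicit instance count, the
  // target starts at the default rather than 0:
  assert(InitialTargetSketch.getInitialTargetExecutorNumber(Map.empty) == 2)
  assert(InitialTargetSketch.getInitialTargetExecutorNumber(
    Map("spark.executor.instances" -> "5")) == 5)
  println("ok")
}
```

With `requestedTotalExecutors` seeded this way, a later `killExecutors` call without dynamic allocation would decrement from a sensible starting point instead of driving the requested total to 0.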
---