Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20604#discussion_r185005638
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1643,7 +1646,10 @@ class SparkContext(config: SparkConf) extends 
Logging {
       def killExecutors(executorIds: Seq[String]): Boolean = {
         schedulerBackend match {
           case b: ExecutorAllocationClient =>
    -        b.killExecutors(executorIds, replace = false, force = 
true).nonEmpty
    +        require(executorAllocationManager.isEmpty,
    --- End diff --
    
    My point in general is that the semantics of combining 
`SparkContext.killExecutors()` (which is a publicly visible function, which the 
end user can call) with dynamic allocation aren't well defined, and I have no 
idea what the behavior really should be.  I was giving some examples of weird 
behavior.  
    
    >> If you've got just one executor, and then you kill it, should your app 
sit with 0 executors?
    
    > if the app sits with 0 executors, then pending tasks increase, which leads 
ExecutorAllocationManager to increase the target number of executors. So, the app 
will not always sit with 0 executors.
    
    That's true -- but only *when* pending tasks increase.  But if you've got 0 
executors, how do you expect pending tasks to increase?  That would only happen 
when another taskset gets submitted, but with no executors your spark program 
will probably just be blocked.
    
    In the other case, I'm just trying to point out strange interactions 
between user control and dynamic allocation control.  Imagine this sequence:
    
    Dynamic Allocation: 1000 tasks, so 1000 executors
    User: I only want 10 executors, so let me tell spark to kill 990 of them
    ...
    ... another taskset is submitted to add 1 more task ...
    Dynamic Allocation: 1001 tasks, so 1001 executors
    User: ??? I set the target to 10 executors, what happened?
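    The race above can be modeled with a toy version of the target-executor 
bookkeeping. All names here (`KillVsDynamicAllocation`, `targetFor`, `target`) 
are made up for illustration -- this is not Spark's actual code, just a sketch of 
why the user's kill gets silently discarded:

    ```scala
    // Toy model of the interaction between a user-issued kill and the
    // dynamic-allocation target. Illustrative only, not Spark internals.
    object KillVsDynamicAllocation {
      // Dynamic allocation's rule in this toy model: one executor per pending task.
      def targetFor(pendingTasks: Int): Int = pendingTasks

      def main(args: Array[String]): Unit = {
        var target = targetFor(1000)  // 1000 tasks -> 1000 executors
        target -= 990                 // user kills 990, hoping to stay at 10
        assert(target == 10)

        // A new taskset adds one task; dynamic allocation recomputes the
        // target from pending tasks and the user's choice is lost.
        target = targetFor(1001)
        assert(target == 1001)        // not 11 -- user intent is gone
        println(s"target after recompute: $target")
      }
    }
    ```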
    
    
    >  So, if we are not using ExecutorAllocationManager, allocation client 
will request requestedTotalExecutors = 0 number of executors to cluster manager 
(this is really terrible)
    
    hmm, from a quick look, I think you're right.  It seems that using 
`sc.killExecutors()` doesn't make sense even with dynamic allocation off.  I 
think `CoarseGrainedSchedulerBackend` should actually initialize 
`requestedTotalExecutors` with 
`SchedulerBackendUtils.getInitialTargetExecutorNumber`
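    A toy sketch of why seeding that counter at 0 is a problem. `ToyBackend` is a 
stand-in, not `CoarseGrainedSchedulerBackend`, and `initialTarget` plays the role 
of `SchedulerBackendUtils.getInitialTargetExecutorNumber` -- everything here is 
an assumption for illustration:

    ```scala
    // Minimal stand-in for the scheduler backend's executor bookkeeping.
    // Illustrative only, not Spark's real implementation.
    class ToyBackend(initialTarget: Int) {
      // The current behavior corresponds to seeding this with 0; the
      // suggested fix is to seed it with the configured initial target.
      private var requestedTotalExecutors: Int = initialTarget

      def killExecutors(ids: Seq[String]): Unit = {
        // Killing lowers the requested total, which is then re-requested
        // from the cluster manager.
        requestedTotalExecutors = math.max(0, requestedTotalExecutors - ids.size)
      }

      def requestedTotal: Int = requestedTotalExecutors
    }
    ```

    With `new ToyBackend(0)` (today's behavior), killing one executor leaves a 
requested total of 0, so the cluster manager is asked for zero executors; seeded 
with a real initial target of, say, 10, killing one of them correctly asks for 9.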

