Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22288#discussion_r222811152
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -415,9 +421,63 @@ private[spark] class TaskSchedulerImpl(
                 launchedAnyTask |= launchedTaskAtCurrentMaxLocality
               } while (launchedTaskAtCurrentMaxLocality)
             }
    +
             if (!launchedAnyTask) {
    -          taskSet.abortIfCompletelyBlacklisted(hostToExecutors)
    -        }
    +          taskSet.getCompletelyBlacklistedTaskIfAny(hostToExecutors) match {
    +            case taskIndex: Some[Int] => // Returns the taskIndex which was unschedulable
    +
    +              // If the taskSet is unschedulable we try to find an existing idle blacklisted
    +              // executor. If we cannot find one, we abort immediately. Else we kill the idle
    +              // executor and kick off an abortTimer which after waiting will abort the taskSet if
    +              // we were unable to schedule any task from the taskSet.
    +              // Note 1: We keep a track of schedulability on a per taskSet basis rather than on a
    +              // per task basis.
    +              // Note 2: The taskSet can still be aborted when there are more than one idle
    +              // blacklisted executors and dynamic allocation is on. This is because we rely on the
    --- End diff --
    
    I think the comment can be clarified/cleaned up a bit.
    
    I think the point here is that the taskSet can still be aborted if the
    executor we kill isn't replaced in time. We don't explicitly idle-timeout or
    kill the other executors, so the allocation manager thinks it already has
    enough and may not request a replacement.
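    
    For readers following the thread, here is a minimal standalone sketch of the
    kill-then-start-abort-timer flow the diff comment describes. All names here
    (handleCompletelyBlacklisted, unschedulableTimeoutMs, the killExecutor and
    abortTaskSet callbacks) are illustrative assumptions, not Spark's actual
    internals:
    
        object AbortTimerSketch {
          import java.util.concurrent.{Executors, TimeUnit}
    
          // Stand-in for the scheduler's abort-timer thread.
          private val abortTimer = Executors.newSingleThreadScheduledExecutor()
    
          // Assumed timeout; the real value would come from a Spark config.
          private val unschedulableTimeoutMs = 120 * 1000L
    
          // Flipped to true by the (simulated) scheduler once any task launches.
          @volatile var launchedAnyTask = false
    
          // Called when every task in the taskSet is blacklisted on all current executors.
          def handleCompletelyBlacklisted(
              idleBlacklistedExecutor: Option[String],
              killExecutor: String => Unit,
              abortTaskSet: String => Unit): Unit = {
            idleBlacklistedExecutor match {
              case Some(execId) =>
                // Free up a slot and hope the cluster manager replaces the executor
                // before the timer fires. As noted above, with dynamic allocation the
                // manager may think it already has enough executors and never do so.
                killExecutor(execId)
                abortTimer.schedule(new Runnable {
                  override def run(): Unit = {
                    if (!launchedAnyTask) {
                      abortTaskSet("no task could be scheduled before the timeout")
                    }
                  }
                }, unschedulableTimeoutMs, TimeUnit.MILLISECONDS)
              case None =>
                // No idle blacklisted executor we could free up: abort immediately.
                abortTaskSet("all tasks blacklisted and no executor can be freed")
            }
          }
        }
    
    The real change keys its timer state per taskSet and clears it once a task
    launches; the sketch collapses that bookkeeping into a single flag.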

