[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

squito Fri, 26 Oct 2018 12:29:16 -0700

Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22288#discussion_r228642015
  
    --- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -415,9 +420,54 @@ private[spark] class TaskSchedulerImpl(
                 launchedAnyTask |= launchedTaskAtCurrentMaxLocality
               } while (launchedTaskAtCurrentMaxLocality)
             }
    +
             if (!launchedAnyTask) {
    -          taskSet.abortIfCompletelyBlacklisted(hostToExecutors)
    +          taskSet.getCompletelyBlacklistedTaskIfAny(hostToExecutors) match 
{
    +            case Some(taskIndex) => // Returns the taskIndex which was 
unschedulable
    +
    +              // If the taskSet is unschedulable we try to find an 
existing idle blacklisted
    +              // executor. If we cannot find one, we abort immediately. 
Else we kill the idle
    +              // executor and kick off an abortTimer which if it doesn't 
schedule a task within the
    +              // the timeout will abort the taskSet if we were unable to 
schedule any task from the
    +              // taskSet.
    +              // Note 1: We keep track of schedulability on a per taskSet 
basis rather than on a per
    +              // task basis.
    +              // Note 2: The taskSet can still be aborted when there are 
more than one idle
    +              // blacklisted executors and dynamic allocation is on. This 
can happen when a killed
    +              // idle executor isn't replaced in time by 
ExecutorAllocationManager as it relies on
    +              // pending tasks and doesn't kill executors on idle 
timeouts, resulting in the abort
    +              // timer to expire and abort the taskSet.
    +              executorIdToRunningTaskIds.find(x => !isExecutorBusy(x._1)) 
match {
    +                case Some ((executorId, _)) =>
    +                  if (!unschedulableTaskSetToExpiryTime.contains(taskSet)) 
{
    +                    blacklistTrackerOpt.foreach(blt => 
blt.killBlacklistedIdleExecutor(executorId))
    +
    +                    val timeout = 
conf.get(config.UNSCHEDULABLE_TASKSET_TIMEOUT) * 1000
    +                    unschedulableTaskSetToExpiryTime(taskSet) = 
clock.getTimeMillis() + timeout
    +                    logInfo(s"Waiting for $timeout ms for completely "
    +                      + s"blacklisted task to be schedulable again before 
aborting $taskSet.")
    +                    abortTimer.schedule(
    +                      createUnschedulableTaskSetAbortTimer(taskSet, 
taskIndex), timeout)
    +                  }
    +                case _ => // Abort Immediately
    +                  logInfo("Cannot schedule any task because of complete 
blacklisting. No idle" +
    +                    s" executors can be found to kill. Aborting $taskSet." 
)
    +                  taskSet.abortSinceCompletelyBlacklisted(taskIndex)
    +              }
    +            case _ => // Do nothing if no tasks completely blacklisted.
    --- End diff --
    
    doesn't matter a ton, I think its just a scala-ism it takes a while to get 
used to.  my rough guidline is: use pattern-matching if you're doing something 
distinct in both the Some and None cases, or if you can make use of more 
complex patterns to avoid more nesting (eg. `case Some(x) if x.isFoo() =>`).  
If you're only doing something in the Some branch, then generally prefer map, 
foreach, filter, etc.
    
    My reason for wanting it here is that when I look at this code, I needed to 
scroll back to figure out what you were even matching on here and make sure you 
weren't ignoring something important.  When I see the `match` up above, I 
assume something is going to happen in both branches.  OTOH if there was a 
`foreach`, when I see the foreach I know right away you're ignoring None.
    
    again this is really minor, I don't actually care that much, just 
explaining my thinking.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

Reply via email to