Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/15986#discussion_r89257822
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -339,7 +341,7 @@ private[spark] class TaskSchedulerImpl(
// We lost this entire executor, so remember that it's gone
val execId = taskIdToExecutorId(tid)
- if (executorIdToTaskCount.contains(execId)) {
+ if (executorIdToRunningTaskIds.contains(execId)) {
reason = Some(
SlaveLost(s"Task $tid was lost, so marking the executor as lost as well."))
removeExecutor(execId, reason.get)
--- End diff --
To look at this another way, note that this code path in `TaskSchedulerImpl` is only reached
from three places:
- LocalSchedulerBackend, which never sends `TaskState.LOST`
- MesosFineGrainedSchedulerBackend, where `TaskState.LOST` means the total
loss of an executor that corresponded to a single task (because in fine-grained
mode each task runs in its own executor)
- CoarseGrainedSchedulerBackend, where this is only called with a state
that comes from a `StatusUpdate` message sent by an executor. That task state
will never be `TaskState.LOST`.
Given all of this, I think that the right course of action here is to
update the comments to clarify that `TaskState.LOST` is only relevant to
fine-grained Mesos mode and to refactor this block to call
```
taskSet.removeRunningTask(tid)
taskResultGetter.enqueueFailedTask(taskSet, tid, state, serializedData)
```
after calling `removeExecutor` for the fine-grained task, and then skip the
rest of the logic, which only applies to local mode or coarse-grained schedulers.
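The proposed restructuring can be sketched as follows. This is a simplified, self-contained model of the dispatch, not the actual Spark code: `handleStatusUpdate`, the `TaskState` stand-in, and the logged action names are all illustrative placeholders for the real scheduler calls.

```scala
// Hypothetical sketch of the proposed statusUpdate restructuring.
// The real code lives in TaskSchedulerImpl.statusUpdate; everything
// here is a simplified stand-in to show the control flow only.
object StatusUpdateSketch {
  sealed trait TaskState
  case object LOST extends TaskState
  case object FINISHED extends TaskState
  case object FAILED extends TaskState

  // Records which scheduler actions would be taken for a given update.
  def handleStatusUpdate(tid: Long, state: TaskState, log: StringBuilder): Unit =
    state match {
      case LOST =>
        // TaskState.LOST only arrives from fine-grained Mesos mode, where
        // one task corresponds to one executor, so losing the task means
        // losing the executor. Remove the executor first, then clean up
        // the task and hand it to the result getter as failed.
        log.append(s"removeExecutor(executorFor($tid))\n")
        log.append(s"removeRunningTask($tid)\n")
        log.append(s"enqueueFailedTask($tid)\n")
      case _ =>
        // Local and coarse-grained backends never send LOST, so they
        // always take the ordinary update path.
        log.append(s"normalUpdate($tid)\n")
    }
}
```

With this shape, the `LOST` branch is isolated to the fine-grained Mesos case and the remaining logic stays untouched for the other two backends.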