Josh Rosen created SPARK-18553:
----------------------------------

             Summary: Executor loss may cause TaskSetManager to be leaked
                 Key: SPARK-18553
                 URL: https://issues.apache.org/jira/browse/SPARK-18553
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.0.0, 1.6.0, 2.1.0
            Reporter: Josh Rosen
            Assignee: Josh Rosen
            Priority: Blocker


Due to a bug in TaskSchedulerImpl, the sudden and complete loss of an executor
may cause a TaskSetManager to be leaked. The leaked TaskSetManager keeps
ShuffleDependencies and other data structures alive indefinitely, which leads
to several kinds of resource leaks (including shuffle file leaks).

In a nutshell, the problem is that TaskSchedulerImpl did not maintain its own
mapping from executorId to running task IDs, leaving it unable to clean up its
taskId-to-TaskSetManager maps when an executor is totally lost.
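
The bookkeeping described above can be illustrated with a minimal sketch. This
is a toy model, not Spark's actual code: the class ToyScheduler, its field and
method names, and the use of a String in place of a TaskSetManager reference
are all assumptions made for illustration. It shows how an executorId to
running-task-IDs map lets the scheduler purge per-task entries when an
executor disappears without reporting task completions.

```scala
import scala.collection.mutable

// Toy model only: all names here are hypothetical, not Spark's internals.
final class ToyScheduler {
  // taskId -> owning task set (a String stands in for a TaskSetManager).
  val taskIdToTaskSet = mutable.HashMap.empty[Long, String]

  // executorId -> IDs of the tasks currently running on that executor.
  val executorIdToRunningTaskIds =
    mutable.HashMap.empty[String, mutable.HashSet[Long]]

  def launchTask(taskId: Long, taskSet: String, executorId: String): Unit = {
    taskIdToTaskSet(taskId) = taskSet
    executorIdToRunningTaskIds
      .getOrElseUpdate(executorId, mutable.HashSet.empty[Long]) += taskId
  }

  // Normal path: the executor reports that a task has finished.
  def taskFinished(taskId: Long, executorId: String): Unit = {
    taskIdToTaskSet.remove(taskId)
    executorIdToRunningTaskIds.get(executorId).foreach(_ -= taskId)
  }

  // Loss path: no per-task status updates will ever arrive, so the
  // per-executor map is the only way to find the entries to purge.
  def executorLost(executorId: String): Unit = {
    executorIdToRunningTaskIds.remove(executorId).foreach { taskIds =>
      taskIds.foreach(id => taskIdToTaskSet.remove(id))
    }
  }
}

object ToySchedulerDemo {
  def main(args: Array[String]): Unit = {
    val scheduler = new ToyScheduler
    scheduler.launchTask(1L, "taskSet-0", "exec-1")
    scheduler.launchTask(2L, "taskSet-0", "exec-1")
    scheduler.launchTask(3L, "taskSet-0", "exec-2")

    scheduler.executorLost("exec-1")

    // Only the task on the surviving executor is still tracked.
    assert(scheduler.taskIdToTaskSet.keySet == Set(3L))
    assert(!scheduler.executorIdToRunningTaskIds.contains("exec-1"))
  }
}
```

Without the per-executor map, executorLost would have no way to enumerate the
affected task IDs, so the entries for tasks 1 and 2 would stay in
taskIdToTaskSet and keep their task set reachable.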



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)