Repository: spark
Updated Branches:
  refs/heads/branch-1.6 d1654864a -> ced71d353


[SPARK-13519][CORE] Driver should tell Executor to stop itself when cleaning executor's state

## What changes were proposed in this pull request?

When the driver removes an executor's state, the connection between the driver 
and the executor may still be alive, so the executor cannot exit on its own 
(e.g., the Master sends RemoveExecutor when a worker is lost but the executor 
is still alive). The driver should therefore try to tell the executor to stop 
itself; otherwise, the executor is leaked.

This PR modifies the driver to send `StopExecutor` to the executor when that 
executor is removed.
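
The ordering matters: the stop message has to go out while the driver still 
holds the executor's endpoint, i.e. before its state is dropped. A minimal, 
self-contained sketch of that pattern follows. It does not use Spark's real 
classes; `FakeExecutorEndpoint`, `DriverSketch`, `register`, and 
`handleRemoveExecutor` are hypothetical names introduced only for 
illustration, and the real RPC layer is replaced by an in-memory buffer.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration; these are NOT Spark's classes.
sealed trait Message
case object StopExecutor extends Message

// Records messages in memory instead of sending them over RPC.
final class FakeExecutorEndpoint {
  val received = mutable.ArrayBuffer.empty[Message]
  def send(msg: Message): Unit = received += msg
}

final case class ExecutorData(executorEndpoint: FakeExecutorEndpoint)

final class DriverSketch {
  val executorDataMap = mutable.HashMap.empty[String, ExecutorData]

  def register(executorId: String, endpoint: FakeExecutorEndpoint): Unit =
    executorDataMap(executorId) = ExecutorData(endpoint)

  // Same order as the patch: notify the executor first, then drop its state.
  // After the removal the endpoint is no longer reachable through the map.
  def handleRemoveExecutor(executorId: String): Unit = {
    executorDataMap.get(executorId).foreach(_.executorEndpoint.send(StopExecutor))
    executorDataMap.remove(executorId)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val driver = new DriverSketch
    val endpoint = new FakeExecutorEndpoint
    driver.register("exec-1", endpoint)
    driver.handleRemoveExecutor("exec-1")
    // A second removal finds no entry, so no further message is sent.
    driver.handleRemoveExecutor("exec-1")
    println(endpoint.received.toList)
  }
}
```

Because the lookup returns an `Option`, removing an executor that is unknown 
or already removed is a no-op rather than an error.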

## How was this patch tested?

Manual test: increased the worker heartbeat interval so that the worker always 
times out, and confirmed that the leaked executors are gone.

Author: Shixiong Zhu <shixi...@databricks.com>

Closes #11399 from zsxwing/SPARK-13519.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c433c0af
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c433c0af
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c433c0af

Branch: refs/heads/branch-1.6
Commit: c433c0afd4c3f96ef24686a1f28262af81b67723
Parents: d165486
Author: Shixiong Zhu <shixi...@databricks.com>
Authored: Fri Feb 26 15:11:57 2016 -0800
Committer: Andrew Or <and...@databricks.com>
Committed: Wed May 11 11:29:01 2016 -0700

----------------------------------------------------------------------
 .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala  | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c433c0af/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index 505c161..7189685 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -179,6 +179,10 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
         context.reply(true)
 
       case RemoveExecutor(executorId, reason) =>
+        // We will remove the executor's state and cannot restore it. However, the connection
+        // between the driver and the executor may be still alive so that the executor won't exit
+        // automatically, so try to tell the executor to stop itself. See SPARK-13519.
+        executorDataMap.get(executorId).foreach(_.executorEndpoint.send(StopExecutor))
         removeExecutor(executorId, reason)
         context.reply(true)
 

