Repository: spark
Updated Branches:
  refs/heads/branch-1.6 d1654864a -> ced71d353

[SPARK-13519][CORE] Driver should tell Executor to stop itself when cleaning executor's state

## What changes were proposed in this pull request?

When the driver removes an executor's state, the connection between the driver and the executor may still be alive, so the executor cannot exit automatically (e.g., the Master sends RemoveExecutor when a worker is lost but the executor is still alive). The driver should therefore try to tell the executor to stop itself; otherwise, the executor leaks. This PR modifies the driver to send `StopExecutor` to the executor when it is removed.

## How was this patch tested?

Manual test: increased the worker heartbeat interval so that the worker always times out, then verified that the leaked executors are gone.

Author: Shixiong Zhu <shixi...@databricks.com>

Closes #11399 from zsxwing/SPARK-13519.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c433c0af
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c433c0af
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c433c0af

Branch: refs/heads/branch-1.6
Commit: c433c0afd4c3f96ef24686a1f28262af81b67723
Parents: d165486
Author: Shixiong Zhu <shixi...@databricks.com>
Authored: Fri Feb 26 15:11:57 2016 -0800
Committer: Andrew Or <and...@databricks.com>
Committed: Wed May 11 11:29:01 2016 -0700

----------------------------------------------------------------------
 .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/c433c0af/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index 505c161..7189685 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -179,6 +179,10 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
         context.reply(true)
 
       case RemoveExecutor(executorId, reason) =>
+        // We will remove the executor's state and cannot restore it. However, the connection
+        // between the driver and the executor may be still alive so that the executor won't exit
+        // automatically, so try to tell the executor to stop itself. See SPARK-13519.
+        executorDataMap.get(executorId).foreach(_.executorEndpoint.send(StopExecutor))
         removeExecutor(executorId, reason)
         context.reply(true)
 
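
The ordering in the diff above matters: StopExecutor has to be sent while the executor's entry is still in executorDataMap, because the endpoint reference is lost once the state is removed. The following is a minimal standalone sketch of that ordering, not Spark's actual code. RecordingEndpoint, ExecutorData (as defined here), ToyDriver, register and handle are hypothetical stand-ins invented for illustration; only the names RemoveExecutor, StopExecutor, executorDataMap and executorEndpoint are taken from the patch.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for Spark's internal messages; these are NOT Spark's classes.
sealed trait Message
case object StopExecutor extends Message
final case class RemoveExecutor(executorId: String, reason: String) extends Message

// Toy endpoint that records every message "sent" to it so the behavior can be inspected.
final class RecordingEndpoint {
  val received: mutable.ArrayBuffer[Message] = mutable.ArrayBuffer.empty[Message]
  def send(msg: Message): Unit = received += msg
}

// Toy per-executor state; assumed here to hold only the endpoint reference.
final case class ExecutorData(executorEndpoint: RecordingEndpoint)

// Toy driver that keeps executor state in a map keyed by executor id.
final class ToyDriver {
  val executorDataMap: mutable.HashMap[String, ExecutorData] =
    mutable.HashMap.empty[String, ExecutorData]

  // Registers an executor and returns its endpoint for later inspection.
  def register(executorId: String): RecordingEndpoint = {
    val endpoint = new RecordingEndpoint
    executorDataMap(executorId) = ExecutorData(endpoint)
    endpoint
  }

  // Same order as the patch: notify the executor first, then drop its state.
  // Returns true to mimic the `context.reply(true)` acknowledgement.
  def handle(msg: RemoveExecutor): Boolean = {
    // If the id is unknown, get returns None and nothing is sent.
    executorDataMap.get(msg.executorId).foreach(_.executorEndpoint.send(StopExecutor))
    executorDataMap.remove(msg.executorId)
    true
  }
}
```

In this sketch, registering "exec-1" and then handling RemoveExecutor("exec-1", "worker lost") leaves exactly one StopExecutor in that endpoint's received buffer and no entry for "exec-1" in executorDataMap. Handling a removal for an unregistered id sends nothing and is still acknowledged.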