GitHub user liyezhang556520 opened a pull request: https://github.com/apache/spark/pull/3825
[SPARK-4991][CORE] Worker should reconnect to Master when Master actor restarts

This is a follow-up JIRA to [SPARK-4989](https://issues.apache.org/jira/browse/SPARK-4991).

When the Master Akka actor encounters an exception, the Master restarts (an Akka actor restart, not a JVM restart), and all of its old state is cleared (including workers, applications, etc.). However, the workers are not aware of this at all. The resulting cluster state is that the master is up and all workers are up, but the master is not aware that the workers exist and ignores every worker heartbeat because none of the workers is registered. As a result, the whole cluster is unavailable.

In this PR, the master tells the worker that the connection has been lost, so that the worker registers with the master again.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liyezhang556520/spark workerReconn

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3825.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3825

----
commit 107e5c58fdbe143fe6eabcfdb5d91d7b1184bb35
Author: Zhang, Liye <liye.zh...@intel.com>
Date:   2014-12-29T07:35:45Z

    worker reconnect to master when master restart for exception

commit e9c99e3969f6e058e46d65575d796d1289351318
Author: Zhang, Liye <liye.zh...@intel.com>
Date:   2014-12-29T08:51:50Z

    add log info

----
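The failure mode and the proposed fix described above can be illustrated with a small toy model. This is only a sketch, not the code in this PR: the `Master` and `Worker` classes, their method names, and the `"ok"` / `"reconnect"` replies are all hypothetical, and the choice to signal the worker in response to a heartbeat from an unknown worker is an assumption made for illustration. It uses plain Python objects rather than Akka actors.

```python
class Master:
    """Toy stand-in for the Master actor (hypothetical, not Spark's class)."""

    def __init__(self):
        self.workers = set()  # ids of currently registered workers

    def register(self, worker_id):
        self.workers.add(worker_id)
        return True

    def restart(self):
        # Models an actor restart (not a JVM restart): in-memory state is lost.
        self.workers.clear()

    def heartbeat(self, worker_id):
        if worker_id in self.workers:
            return "ok"
        # Assumed fix: instead of silently ignoring an unknown worker,
        # tell it that the connection is gone so it can register again.
        return "reconnect"


class Worker:
    """Toy stand-in for the Worker actor (hypothetical)."""

    def __init__(self, worker_id, master):
        self.worker_id = worker_id
        self.master = master
        self.registered = False

    def register(self):
        self.registered = self.master.register(self.worker_id)

    def send_heartbeat(self):
        reply = self.master.heartbeat(self.worker_id)
        if reply == "reconnect":
            # The master no longer knows this worker: register again.
            self.registered = False
            self.register()
        return reply


if __name__ == "__main__":
    master = Master()
    worker = Worker("worker-1", master)
    worker.register()
    master.restart()                       # master forgets all workers
    worker.send_heartbeat()                # worker is told to reconnect
    print("worker-1" in master.workers)    # worker is registered again
```

Without the `"reconnect"` reply, the worker in this model would keep sending heartbeats that the master drops, which matches the unavailable-cluster state described above.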