[ https://issues.apache.org/jira/browse/SPARK-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen resolved SPARK-4991.
-------------------------------
    Resolution: Won't Fix

Now that SPARK-5293, which removes Spark Core's dependency on Akka, has been completed for Spark 2.0, I'm going to mark this issue as "Won't Fix."

> Worker should reconnect to Master when Master actor restart
> -----------------------------------------------------------
>
>                 Key: SPARK-4991
>                 URL: https://issues.apache.org/jira/browse/SPARK-4991
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy, Spark Core
>    Affects Versions: 1.0.0, 1.1.0, 1.2.0
>            Reporter: Zhang, Liye
>
> This is a follow-up JIRA to
> [SPARK-4989|https://issues.apache.org/jira/browse/SPARK-4989]. When the Master
> Akka actor encounters an exception, the Master restarts (an Akka actor
> restart, not a JVM restart), and all of its old state is cleared
> (including workers, applications, etc.). However, the workers are not aware
> of this at all. The resulting cluster state is: the Master is up and all
> workers are up, but the Master is not aware that the workers exist and
> ignores every worker's heartbeat because no worker is registered. As a
> result, the whole cluster is unavailable.
> For more information on this area, please refer to
> [SPARK-3736|https://issues.apache.org/jira/browse/SPARK-3736] and
> [SPARK-4592|https://issues.apache.org/jira/browse/SPARK-4592]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
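
The failure mode described in the issue can be illustrated with a small toy model. This is only a sketch in Python, not Spark's code: the names Master, Worker, and simulate, the "reconnect" reply, and the ask_reconnect flag are all hypothetical and exist purely for illustration. It assumes the Master keeps worker registrations only in memory, so an actor restart loses them while the process stays up.

```python
class Master:
    """Toy master that keeps worker registrations only in memory."""

    def __init__(self):
        self.workers = {}  # worker_id -> number of heartbeats received

    def register(self, worker_id):
        self.workers[worker_id] = 0

    def restart(self):
        # Models an actor restart: the process stays up, but state is lost.
        self.workers.clear()

    def heartbeat(self, worker_id, ask_reconnect=False):
        if worker_id in self.workers:
            self.workers[worker_id] += 1
            return "ok"
        # Unknown worker: either ignore the heartbeat (the behaviour the
        # issue describes) or ask the worker to register again
        # (hypothetical remedy, not Spark's actual protocol).
        return "reconnect" if ask_reconnect else "ignored"


class Worker:
    """Toy worker that registers once and then sends heartbeats."""

    def __init__(self, worker_id, master):
        self.worker_id = worker_id
        self.master = master
        master.register(worker_id)

    def send_heartbeat(self, ask_reconnect=False):
        reply = self.master.heartbeat(self.worker_id, ask_reconnect)
        if reply == "reconnect":
            # Re-register so the master learns about this worker again.
            self.master.register(self.worker_id)
        return reply


def simulate(ask_reconnect):
    """Return True if the master knows the worker after a restart."""
    master = Master()
    worker = Worker("worker-1", master)
    worker.send_heartbeat(ask_reconnect)
    master.restart()
    worker.send_heartbeat(ask_reconnect)
    return "worker-1" in master.workers
```

Calling simulate(False) models the reported problem: the worker keeps sending heartbeats, but the restarted master never learns about it. Calling simulate(True) models one possible remedy, in which the master replies to an unknown worker's heartbeat with a request to register again.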