Up until last week we had no problems running a Spark standalone cluster, but now every application fails to register executors with the driver node. Although we can run start-all.sh and see the worker in the web UI on port 8080, no executors are registered with the BlockManager.
The feedback we have is scant, but we're getting messages like the following, which suggest a name-resolution issue of some kind:

    14/04/09 08:22:58 INFO Master: akka.tcp://spark@Spark0:51214 got disassociated, removing it.
    14/04/09 08:22:58 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@100.92.60.69:7077] -> [akka.tcp://spark@Spark0:51214]:
      Error [Association failed with [akka.tcp://spark@Spark0:51214]]
      [akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@Spark0:51214]
       Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: Spark0/100.92.60.69:51214]

Any insight would be helpful.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/executors-not-registering-with-the-driver-tp4000.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
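If the symptom really is name resolution, one avenue worth trying is pinning the bind addresses explicitly in conf/spark-env.sh rather than relying on whatever `hostname` resolves to on each box. A sketch, assuming the master's address and the Spark0 hostname from the log above; the exact addresses will differ per cluster:

```shell
# conf/spark-env.sh -- a sketch, not a verified fix; addresses are placeholders
# taken from the log output above.

# On the master: advertise an address the workers can actually reach, instead
# of whatever `hostname` returns locally.
export SPARK_MASTER_IP=100.92.60.69

# On each worker/driver node: bind to the interface other nodes can route to,
# so the callback address the executor advertises (spark@Spark0:51214 in the
# log) resolves from the master's side. `hostname -I` prints the host's
# addresses on Linux; the first one is used here.
export SPARK_LOCAL_IP=$(hostname -I | awk '{print $1}')
```

It may also be worth confirming that Spark0 resolves to the same address on every node (e.g. via /etc/hosts), since the "Connection refused: Spark0/100.92.60.69:51214" line shows the master resolving Spark0 to 100.92.60.69 and then failing to connect back on the executor's port.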