I get a "Connection refused" error when starting the first worker of a standalone cluster (no YARN or Mesos). There are no firewalls, and the master and worker nodes can ping each other.
I started the master process manually. It is running, and I can see its Web UI on port 8080. I am using "spark-0.9.0-incubating-bin-hadoop2.tgz".

===============================================
spark-0.9.0-incubating-bin-hadoop2]$ ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://s1.machine.org:7077
14/02/07 07:00:58 INFO Utils: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/02/07 07:00:58 WARN Utils: Your hostname, s2.machine.org resolves to a loopback address: 127.0.0.1; using 192.168.64.122 instead (on interface eth1)
14/02/07 07:00:58 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/02/07 07:00:59 INFO Slf4jLogger: Slf4jLogger started
14/02/07 07:00:59 INFO Remoting: Starting remoting
14/02/07 07:00:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@s2:49614]
14/02/07 07:00:59 INFO Worker: Starting Spark worker s2:49614 with 1 cores, 853.0 MB RAM
14/02/07 07:00:59 INFO Worker: Spark home: /home/vagrant/spark-0.9.0-incubating-bin-hadoop2
14/02/07 07:00:59 INFO WorkerWebUI: Started Worker web UI at http://s2:8081
14/02/07 07:00:59 INFO Worker: Connecting to master spark://s1.machine.org:7077...
14/02/07 07:00:59 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@s2:49614] -> [akka.tcp://sparkMaster@s1.machine.org:7077]: Error [Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: s1.machine.org/192.168.64.121:7077
]
14/02/07 07:00:59 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@s2:49614] -> [akka.tcp://sparkMaster@s1.machine.org:7077]: Error [Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: s1.machine.org/192.168.64.121:7077
]
14/02/07 07:00:59 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@s2:49614] -> [akka.tcp://sparkMaster@s1.machine.org:7077]: Error [Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: s1.machine.org/192.168.64.121:7077
]
14/02/07 07:00:59 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@s2:49614] -> [akka.tcp://sparkMaster@s1.machine.org:7077]: Error [Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@s1.machine.org:7077]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: s1.machine.org/192.168.64.121:7077
]
14/02/07 07:00:59 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#607746123] to Actor[akka://sparkWorker/deadLetters] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
...
14/02/07 07:01:59 ERROR Worker: All masters are unresponsive! Giving up.
===============================================
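
Since ping only exercises ICMP, a TCP-level check of the master port may be a more useful data point than the ping result. Below is a minimal sketch of such a check, assuming Python 3 is available on the nodes. The helper name `can_connect` is hypothetical, and the host and port in the usage comment are simply the ones from the log above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical usage with the values from the log:
#   can_connect("s1.machine.org", 7077)
```

Running the same check on both s1 and s2 would show whether port 7077 is closed everywhere or only unreachable from the worker.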
