Hi all,

I’m trying to deploy Spark in standalone mode. Everything goes as usual: the web UI is accessible, and the master node wrote some logs saying all workers are registered:

14/01/15 01:37:30 INFO Slf4jEventHandler: Slf4jEventHandler started
14/01/15 01:37:31 INFO ActorSystemImpl: RemoteServerStarted@akka://[email protected]:7077
14/01/15 01:37:31 INFO Master: Starting Spark master at spark://172.31.36.93:7077
14/01/15 01:37:31 INFO MasterWebUI: Started Master web UI at http://ip-172-31-36-93.us-west-2.compute.internal:8080
14/01/15 01:37:31 INFO Master: I have been elected leader! New state: ALIVE
14/01/15 01:37:34 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:37914
14/01/15 01:37:34 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:43055
14/01/15 01:37:34 INFO Master: Registering worker ip-172-31-34-61.us-west-2.compute.internal:37914 with 2 cores, 6.3 GB RAM
14/01/15 01:37:34 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:55355
14/01/15 01:37:34 INFO Master: Registering worker ip-172-31-40-28.us-west-2.compute.internal:43055 with 2 cores, 6.3 GB RAM
14/01/15 01:37:34 INFO Master: Registering worker ip-172-31-45-211.us-west-2.compute.internal:55355 with 2 cores, 6.3 GB RAM
14/01/15 01:37:34 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:47709
14/01/15 01:37:34 INFO Master: Registering worker ip-172-31-41-251.us-west-2.compute.internal:47709 with 2 cores, 6.3 GB RAM
14/01/15 01:37:34 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:36257
14/01/15 01:37:34 INFO Master: Registering worker ip-172-31-43-78.us-west-2.compute.internal:36257 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO ActorSystemImpl: RemoteClientStarted@akka://[email protected]:43086

However, when I launched an application, the master first “attempted to re-register” each worker and then said that all heartbeats are from “unregistered” workers. Can anyone tell me what happened here?

14/01/15 01:38:44 INFO Master: Registering app ALS
14/01/15 01:38:44 INFO Master: Registered app ALS with ID app-20140115013844-0000
14/01/15 01:38:44 INFO Master: Launching executor app-20140115013844-0000/0 on worker worker-20140115013734-ip-172-31-43-78.us-west-2.compute.internal-36257
14/01/15 01:38:44 INFO Master: Launching executor app-20140115013844-0000/1 on worker worker-20140115013734-ip-172-31-40-28.us-west-2.compute.internal-43055
14/01/15 01:38:44 INFO Master: Launching executor app-20140115013844-0000/2 on worker worker-20140115013734-ip-172-31-34-61.us-west-2.compute.internal-37914
14/01/15 01:38:44 INFO Master: Launching executor app-20140115013844-0000/3 on worker worker-20140115013734-ip-172-31-45-211.us-west-2.compute.internal-55355
14/01/15 01:38:44 INFO Master: Launching executor app-20140115013844-0000/4 on worker worker-20140115013734-ip-172-31-41-251.us-west-2.compute.internal-47709
14/01/15 01:38:44 INFO Master: Registering worker ip-172-31-40-28.us-west-2.compute.internal:43055 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO Master: Attempted to re-register worker at same address: akka://[email protected]:43055
14/01/15 01:38:44 INFO Master: Registering worker ip-172-31-34-61.us-west-2.compute.internal:37914 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO Master: Attempted to re-register worker at same address: akka://[email protected]:37914
14/01/15 01:38:44 INFO Master: Registering worker ip-172-31-41-251.us-west-2.compute.internal:47709 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO Master: Attempted to re-register worker at same address: akka://[email protected]:47709
14/01/15 01:38:44 INFO Master: Registering worker ip-172-31-45-211.us-west-2.compute.internal:55355 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO Master: Attempted to re-register worker at same address: akka://[email protected]:55355
14/01/15 01:38:44 INFO Master: Registering worker ip-172-31-43-78.us-west-2.compute.internal:36257 with 2 cores, 6.3 GB RAM
14/01/15 01:38:44 INFO Master: Attempted to re-register worker at same address: akka://[email protected]:36257
14/01/15 01:38:44 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-34-61.us-west-2.compute.internal-37914
14/01/15 01:38:44 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-45-211.us-west-2.compute.internal-55355
14/01/15 01:38:44 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-40-28.us-west-2.compute.internal-43055
14/01/15 01:38:44 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-43-78.us-west-2.compute.internal-36257
14/01/15 01:38:44 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-41-251.us-west-2.compute.internal-47709
14/01/15 01:38:50 WARN Master: Got heartbeat from unregistered worker worker-20140115013844-ip-172-31-45-211.us-west-2.compute.internal-55355

Thank you very much!

Best,
--
Nan Zhu
