On Sat, Apr 12, 2014 at 9:19 AM, ge ko <koenig....@gmail.com> wrote:
> Hi,
>
> I'm wondering why the master is registering itself at startup, exactly 3
> times (same number as the number of workers). Log excerpt:
> ""
> 2014-04-11 21:08:15,363 INFO akka.event.slf4j.Slf4jLogger: Slf4jLogger
> started
> 2014-04-11 21:08:15,478 INFO Remoting: Starting remoting
> 2014-04-11 21:08:15,838 INFO Remoting: Remoting started; listening on
> addresses :[akka.tcp://sparkMaster@hadoop-pg-5.cluster:7077]
> 2014-04-11 21:08:16,252 INFO org.apache.spark.deploy.master.Master: Starting
> Spark master at spark://hadoop-pg-5.cluster:7077
> 2014-04-11 21:08:16,299 INFO org.eclipse.jetty.server.Server:
> jetty-7.x.y-SNAPSHOT
> 2014-04-11 21:08:16,341 INFO org.eclipse.jetty.server.AbstractConnector:
> Started SelectChannelConnector@0.0.0.0:18080
> 2014-04-11 21:08:16,343 INFO org.apache.spark.deploy.master.ui.MasterWebUI:
> Started Master web UI at http://hadoop-pg-5.cluster:18080
> 2014-04-11 21:08:16,374 INFO org.apache.spark.deploy.master.Master: I have
> been elected leader! New state: ALIVE
> 2014-04-11 21:08:21,492 INFO org.apache.spark.deploy.master.Master:
> Registering worker hadoop-pg-5.cluster:7078 with 2 cores, 64.0 MB RAM
> 2014-04-11 21:08:31,362 INFO org.apache.spark.deploy.master.Master:
> Registering worker hadoop-pg-5.cluster:7078 with 2 cores, 64.0 MB RAM
> 2014-04-11 21:08:34,819 INFO org.apache.spark.deploy.master.Master:
> Registering worker hadoop-pg-5.cluster:7078 with 2 cores, 64.0 MB RAM
> ""
>
> The workers should be hadoop-pg-7/-8/-9
> This seems strange to me, or do I just interpret the log entry wrong ?!?!
> Perhaps this relates to my other post titled "TaskSchedulerImpl: Initial job
> has not accepted any resources" and both issues are caused by the same
> problem/misconfiguration.
>
> All the nodes can reach each other via ping and telnet on the corresponding
> ports.
>
> Any hints what is going wrong there?

I had this happen in a VirtualBox configuration I was testing with and
never got it fixed, but I suspected it was because I wasn't using FQDNs
(which you aren't using either). I had also found a message in the
archives, which of course I can't find again now, from somebody who hit
the same issue even though they were using FQDNs; they said they had
found an error in /etc/hosts.
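
The archived message did not say what the /etc/hosts error was, so the
following is only a sketch of one way to look for the usual suspects. It
assumes (my assumption, not something confirmed in this thread) that the
problem is a cluster hostname that is missing from the file, mapped to a
loopback address such as 127.0.0.1 or 127.0.1.1, or listed with more than
one address. The function names are hypothetical.

```python
import ipaddress
from collections import defaultdict


def parse_hosts(text):
    """Map each hostname in hosts-file text to the set of addresses listed for it."""
    mapping = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments, then skip blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping[name.lower()].add(address)
    return mapping


def is_loopback(address):
    """True if the address parses as an IP address in a loopback range."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False  # not a plain IP literal; treat as non-loopback


def suspicious_entries(text, cluster_hosts):
    """Return {hostname: reason} for cluster hostnames whose entries look wrong."""
    mapping = parse_hosts(text)
    problems = {}
    for host in cluster_hosts:
        addresses = mapping.get(host.lower(), set())
        if not addresses:
            problems[host] = "no entry"
        elif any(is_loopback(a) for a in addresses):
            problems[host] = "maps to a loopback address"
        elif len(addresses) > 1:
            problems[host] = "maps to several addresses"
    return problems
```

To use it, read /etc/hosts on each node and pass the contents, together
with the master and worker hostnames exactly as they appear in the Spark
configuration, to suspicious_entries(). An empty result only means none of
these three patterns was found, not that name resolution is correct.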

Good luck.
