I have a rather odd problem.
We had some DNS issues and the DNS servers were reloaded over the
weekend.
Now, my Nutch cluster can't see itself. I can easily ssh between
machines, name resolution seems to be working just fine, they can all
ping each other, etc. The problem is that when I run bin/start-all.sh
and check the web cluster summary, only one node is connected (the slave
node started on the master machine). If I ssh into one of the slave
nodes and check the logs, I can see it trying to connect to the master node,
but to no avail. Does anyone have any recommendations on where things
might be messed up? Oh, I should add that everything properly rsyncs and
starts when start-all.sh is run; the only thing that doesn't happen is
the slaves connecting back to the master.
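
In case it helps narrow things down, a check along these lines could be
run from one of the slaves to separate a name-resolution problem from a
port that is blocked or not being listened on. It is only a rough sketch
in Python: the function names are made up for illustration, and the
master hostname and port have to be taken from the cluster's own config
(the values in the usage comment are placeholders, not real settings).

```python
import socket


def check_resolution(hostname):
    """Look up hostname -> IP, then IP -> name, and report both.

    A missing or mismatched reverse entry is easy to overlook when
    ping and ssh (which only need the forward lookup) still work.
    """
    result = {"hostname": hostname, "forward": None, "reverse": None}
    try:
        # Forward lookup: name to IPv4 address.
        result["forward"] = socket.gethostbyname(hostname)
    except socket.gaierror:
        return result  # name does not resolve at all
    try:
        # Reverse lookup: address back to its primary name.
        result["reverse"] = socket.gethostbyaddr(result["forward"])[0]
    except (socket.herror, socket.gaierror):
        pass  # no reverse record; leave it as None
    return result


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


# Example usage from a slave (placeholder host and port):
#   print(check_resolution("master.example.com"))
#   print(check_resolution(socket.gethostname()))
#   print(can_connect("master.example.com", 9000))
```

If the forward and reverse lookups disagree, or the connection test fails
even though the name resolves, that would at least suggest whether to
look at DNS records or at what address the master is listening on.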
Thanks for any help.
Jeff