I was asking whether you can ping the master from the slaves. Can you reach
the namenode from one or more of the remote datanodes? If so, in the
hadoop-site.xml files on the datanodes, is the namenode variable
pointing to the fqdn of the namenode instead of local?
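
A rough way to check this on each datanode is sketched below. It assumes the namenode variable is the fs.default.name property and that its value is either host:port or hdfs://host:port; the function names and the list of "local" names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: values that mean "this machine" rather than a remote namenode.
LOCAL_NAMES = {"local", "localhost", "localhost.localdomain", "127.0.0.1"}


def get_property(xml_text, name):
    """Return the <value> of the <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def namenode_host(value):
    """Extract the host part from 'host:port' or 'hdfs://host:port'."""
    if "://" not in value:
        value = "hdfs://" + value
    return urlparse(value).hostname


def points_to_local(xml_text, name="fs.default.name"):
    """True if the property is unset or names the local machine."""
    value = get_property(xml_text, name)
    if value is None:
        return True  # unset: treated here as local
    host = namenode_host(value)
    return host is None or host in LOCAL_NAMES
```

To use it, read conf/hadoop-site.xml on a datanode into a string and pass it to points_to_local(); a True result suggests that node is not pointed at the remote namenode.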
Dennis Kubes
Bolle, Jeffrey F. wrote:
Everything pings fine and nslookups all come back normally. The ssh
connections work just fine, as the bin/slaves.sh program will run and I
can check all of the uptimes remotely and everything.
Looking at the logs there is nothing out of the ordinary. I see Jetty
come up on each of the nodes as well as the main server. Jetty says it
is listening on 0.0.0.0:50070 for the namenode, 0.0.0.0:50060 for the
tasktracker, 0.0.0.0:50030 for the jobtracker, and 0.0.0.0:50075 for
the data node. The datanode logs on all of the clients had a "no route
to host" exception from earlier, but other than that there is nothing.
In the task tracker logs everything looks normal with Jetty starting.
When running a hadoop fsck / I see that the blocks aren't being
replicated to any of the servers (which makes complete sense with the
idea that my master isn't communicating with any of the slaves).
In my slaves file there is one fqdn per line for each of the 4
machines. This file is the same on all 4 machines. Any ideas on
debugging this?
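
Since ping and ssh can succeed while other TCP ports are blocked, one thing that could be tried is a plain TCP connect from each slave to the master's ports. This is only a sketch: the function names are made up, and the port list would need to include whatever namenode and jobtracker ports are configured in hadoop-site.xml, not only the Jetty ports listed above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures,
        # and "no route to host" errors.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether a connection to it succeeded."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Run from a slave, something like check_ports("master.example.com", [50070, 50030]) (hostname hypothetical) would show which ports are reachable; a False for a port that Jetty reports as listening would point at something between the machines rather than at Hadoop itself.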
Jeff
-----Original Message-----
From: Vishal Shah [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 07, 2007 1:44 AM
To: [email protected]
Subject: RE: Hadoop oddity
Hi Jeff,
Can you also try an nslookup for the master from the slave nodes? Does
that work properly? Also, it would be good to see the jobtracker and
tasktracker logs.
-vishal.
-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 07, 2007 9:58 AM
To: [email protected]
Subject: Re: Hadoop oddity
The other things to check would be the ability to ping the master from the
slave nodes, the correct fqdn in the slave nodes' hadoop-site.xml files, and
correct dns setup for the master.
Dennis Kubes
Bolle, Jeffrey F. wrote:
The hosts file looks fine...still only showing 1 node.
Jeff
-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 06, 2007 7:42 PM
To: [email protected]
Subject: Re: Hadoop oddity
If the hosts file on the namenode is not set up correctly it could be
listening only on localhost. Make sure your /etc/hosts file looks
something like this:
127.0.0.1 localhost localhost.localdomain
x.x.x.x yourcomputer.domain.tld
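
As a quick sanity check, the sketch below parses hosts-file text and reports whether a given hostname is mapped to a loopback address, which is the misconfiguration described above. The function names are hypothetical, and it only looks at the file contents, not at what the resolver actually returns.

```python
import ipaddress


def parse_hosts(text):
    """Parse hosts-file text into {hostname: [addresses]}."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()  # address, then whitespace-separated names
        addr, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), []).append(addr)
    return mapping


def _is_loopback(addr):
    """True for 127.0.0.0/8 or ::1; False for anything unparsable."""
    try:
        return ipaddress.ip_address(addr).is_loopback
    except ValueError:
        return False


def maps_to_loopback(text, hostname):
    """True if any hosts entry binds hostname to a loopback address."""
    addrs = parse_hosts(text).get(hostname.lower(), [])
    return any(_is_loopback(a) for a in addrs)
```

Calling maps_to_loopback(open("/etc/hosts").read(), "yourcomputer.domain.tld") on the namenode should return False if the file follows the layout shown above.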
Dennis Kubes
Bolle, Jeffrey F. wrote:
In theory I have a cluster with 4 nodes. When running something like
bin/slaves.sh uptime I get the desired results (all four servers
respond with their uptimes). However, when I run a crawl, only one
server, the host (which also acts as a slave), appears under the nodes
display. This has happened since the primary server died and was
rebuilt. Has anyone experienced this before, or does anyone have
any tips as to where to begin looking for the problem? Thanks.
Jeff