This bug is driving me crazy! What tools could I use to find out why the slaves are not reported as being part of the cluster? I can't find anything wrong in the log files.
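In case it helps, this is roughly what I've been running to see which datanodes the namenode actually reports (paths are relative to my Hadoop install directory, and the slave's log file name depends on the user and host name):

    # Ask the namenode which datanodes it currently considers live
    # (run on the master, from the Hadoop installation directory).
    bin/hadoop dfsadmin -report

    # On the slave, check whether the datanode log has any content at all;
    # the exact file name follows the hadoop-<user>-datanode-<host>.log pattern.
    ls -l logs/
    cat logs/*-datanode-*.log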
Using Wireshark, I confirmed that the heartbeat between the slaves and the master is working. The SSH communication between the master and the slaves is also working fine. It's like everything is perfect... except that it isn't. The admin pages keep reporting that there is only one node in the cluster (the master acts as a slave too, and that one is working). Maybe the problem is with the admin pages. I'm also seeing VERY slow transfer rates, and I don't see what could cause that either. Any ideas, anyone?

-----Original Message-----
From: Sebastien Rainville [mailto:[EMAIL PROTECTED]]
Sent: November 10, 2007 2:18 PM
To: [email protected]
Subject: cluster startup problem

Hi,

I have a cluster made of only 2 PCs. The master also acts as a slave. The cluster seems to start properly and is functional (I can access the DFS, monitor it with the web interfaces, and there are no errors in the log files...), but it reports that only 1 node is up. For some reason the datanode on the slave doesn't start properly. The weirdest thing is that it is actually listed in the running processes when I run 'jps', and the log file for the datanode exists but is empty... Another weird thing is that the file hadoop.log is empty on the master, so I can't find any debugging information. Also, I don't know what to think about the tasktracker on the slave: its log file seems fine (it reports that it is starting properly), but I can't open its admin page in a browser.

I have another question: what is required for a client application to connect to the cluster? I thought that all I needed was a custom hadoop-site.xml placed in the classpath, but it doesn't work.

Thanks,
Sebastien
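P.S. Regarding the client configuration question at the end of my original message: this is roughly what the client's hadoop-site.xml looks like (the host name "master" and the ports are placeholders for my actual values, copied from the cluster's own hadoop-site.xml):

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <!-- Namenode address; must match what the cluster itself uses. -->
      <property>
        <name>fs.default.name</name>
        <value>master:9000</value>
      </property>
      <!-- Jobtracker address, needed to submit jobs. -->
      <property>
        <name>mapred.job.tracker</name>
        <value>master:9001</value>
      </property>
    </configuration>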
