In hadoop-site.xml on your namenode, what is the value of fs.default.name? It should be set to the fully-qualified domain name of the host.
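For example, the entry in hadoop-site.xml would look something like this (a sketch only; the host and port here are taken from the datanode log below, so adjust them to your actual namenode):

<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- use the FQDN the datanodes see, not localhost/127.0.0.1 -->
    <value>hadoop01.ourcompany.com:9000</value>
  </property>
</configuration>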
On 5/13/07 11:40 PM, "Cedric Ho" <[EMAIL PROTECTED]> wrote:

> I performed more testing on this. While the namenode is running, I
> cannot connect to 192.168.1.179:9000 from other machines, but I can
> connect to it locally. It seems that the ServerSocket binds only to
> 127.0.0.1:9000 and not to 192.168.1.179:9000.
>
> I've also confirmed that there is no firewall, connection blocking, etc.
> on this machine. In fact, I've written a small Java program that opens a
> ServerSocket on port 9000, started by the same user on the same machine,
> and I am able to connect to it from all the other machines.
>
> So is there some setting that will cause the namenode to bind port 9000
> only on the local interface?
>
> Cedric
>
>
> On 5/12/07, Michael Bieniosek <[EMAIL PROTECTED]> wrote:
>> I would try to debug this as a network problem - when the namenode is
>> running, can you connect to 192.168.1.179:9000 from the machine the
>> datanode is on?
>>
>> While the namenode does use a lot of RAM as the cluster size increases,
>> an overloaded namenode will typically start panicking in its log
>> messages. That doesn't happen in your namenode logs - it doesn't appear
>> that any datanodes connected at all.
>>
>> -Michael
>>
>> On 5/10/07 7:39 PM, "Cedric Ho" <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> We were trying to set up Hadoop in our Linux environment. When we
>>> tried to use a slow machine as the namenode (a Pentium III with
>>> 512 MB RAM), it seemed unable to accept connections from the other
>>> datanodes. (I can access its status over HTTP at port 50070, however.)
>>>
>>> But it works fine on a faster machine (Pentium 4, 3 GHz, with 3 GB RAM).
>>> The settings, etc. are exactly the same.
>>>
>>> The problem seems to be with the namenode. Is it because the machine
>>> is slow?
>>>
>>> The version we use is 0.12.3.
>>>
>>> Any help is appreciated.
>>>
>>>
>>> Following is the log from the abnormal namenode.
>>>
>>> 2007-05-09 18:18:46,998 INFO org.apache.hadoop.dfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
>>> 2007-05-09 18:18:47,000 INFO org.apache.hadoop.dfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
>>> 2007-05-09 18:18:47,432 INFO org.mortbay.util.Credential: Checking Resource aliases
>>> 2007-05-09 18:18:48,051 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
>>> 2007-05-09 18:18:50,524 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
>>> 2007-05-09 18:18:51,064 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
>>> 2007-05-09 18:18:51,065 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
>>> 2007-05-09 18:18:51,065 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
>>> 2007-05-09 18:18:51,147 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50070
>>> 2007-05-09 18:18:51,148 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
>>> 2007-05-09 18:18:51,223 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting
>>> 2007-05-09 18:18:51,226 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
>>> 2007-05-09 18:18:51,227 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000: starting
>>> 2007-05-09 18:18:51,228 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting
>>> 2007-05-09 18:18:51,229 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000: starting
>>> 2007-05-09 18:18:51,391 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting
>>> 2007-05-09 18:18:51,392 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000: starting
>>> 2007-05-09 18:18:51,393 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
>>> 2007-05-09 18:18:51,394 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting
>>> 2007-05-09 18:18:51,395 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting
>>> 2007-05-09 18:18:51,397 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
>>>
>>>
>>> And these are from the datanode
>>>
>>> 2007-05-09 18:35:13,263 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 1 time(s).
>>> 2007-05-09 18:35:14,266 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 2 time(s).
>>> 2007-05-09 18:35:15,270 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 3 time(s).
>>> 2007-05-09 18:35:16,274 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 4 time(s).
>>> 2007-05-09 18:35:17,279 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 5 time(s).
>>> 2007-05-09 18:35:18,283 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 6 time(s).
>>> 2007-05-09 18:35:19,288 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 7 time(s).
>>> 2007-05-09 18:35:20,293 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 8 time(s).
>>> 2007-05-09 18:35:21,295 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 9 time(s).
>>> 2007-05-09 18:35:22,298 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 10 time(s).
>>> 2007-05-09 18:35:23,304 INFO org.apache.hadoop.ipc.RPC: Server at hadoop01.ourcompany.com/192.168.1.179:9000 not available yet, Zzzzz...
>>> 2007-05-09 18:35:24,308 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 1 time(s).
>>> 2007-05-09 18:35:25,317 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 2 time(s).
>>> 2007-05-09 18:35:26,322 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop01.ourcompany.com/192.168.1.179:9000. Already tried 3 time(s).
>>>
>>>
>>> Thanks,
>>> Cedric
>>
>>
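By the way, a ServerSocket opened with just a port number, like the standalone test you describe above, binds to the wildcard address (all interfaces) by default, which is why it is reachable from the other machines. A minimal sketch of that kind of bind test (not your exact program, just plain java.net):

import java.net.ServerSocket;
import java.net.Socket;

public class BindTest {
    public static void main(String[] args) throws Exception {
        // new ServerSocket(9000) binds to 0.0.0.0:9000 (all interfaces),
        // so remote machines can connect as long as nothing blocks the port.
        ServerSocket server = new ServerSocket(9000);
        System.out.println("Listening on " + server.getLocalSocketAddress());
        while (true) {
            Socket client = server.accept();
            System.out.println("Connection from " + client.getRemoteSocketAddress());
            client.close();
        }
    }
}

If a test like that accepts remote connections while the namenode on the same box only answers on 127.0.0.1:9000, the difference is the address the namenode ends up binding to, which is why the fs.default.name / hostname question above matters.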
