Hi Michael,

Please check: http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html#Logging
Then see your master and slave logs. As far as I can tell, the logs in your emails show that the connection is failing, but not what is causing it to fail.

Thanks and Regards,
Sonal

On Tue, Mar 9, 2010 at 3:53 PM, jiang licht <[email protected]> wrote:

> Thanks Sonal. How to set that debug mode? Actually I set
> "dfs.namenode.logging.level" to "all". Please see my first and previous
> posts for error messages.
>
> Thanks,
>
> Michael
>
> --- On Tue, 3/9/10, Sonal Goyal <[email protected]> wrote:
>
> From: Sonal Goyal <[email protected]>
> Subject: Re: where does jobtracker get the IP and port of namenode?
> To: [email protected]
> Date: Tuesday, March 9, 2010, 4:01 AM
>
> Can you turn logging level to debug to see what the logs say?
>
> Thanks and Regards,
> Sonal
>
>
> On Tue, Mar 9, 2010 at 1:08 PM, jiang licht <[email protected]> wrote:
>
> > I guess my confusion is this:
> >
> > I point "fs.default.name" to hdfs:A:50001 in core-site.xml (A is IP
> > address). I assume when tasktracker starts, it should use A:50001 to
> > contact namenode. But actually, tasktracker log shows that it uses B
> > which is IP address of another network interface of the namenode box
> > and because the tasktracker box cannot reach address B, the tasktracker
> > simply retries connection and finally fails to start. I read some
> > source code in org.apache.hadoop.hdfs.DistributedFileSystem.initialize
> > and it seems to me the namenode address is passed in earlier from what
> > is specified in "fs.default.name". Is this correct that the namenode
> > address used here by tasktracker comes from "fs.default.name" in
> > core-site.xml or somehow there is another step in which this value is
> > changed? Could someone elaborate this process how tasktracker resolves
> > namenode and contacts it? Thanks!
> >
> > Thanks,
> >
> > Michael
> >
> > --- On Tue, 3/9/10, jiang licht <[email protected]> wrote:
> >
> > From: jiang licht <[email protected]>
> > Subject: Re: where does jobtracker get the IP and port of namenode?
> > To: [email protected]
> > Date: Tuesday, March 9, 2010, 12:20 AM
> >
> > Sorry, that was a typo in my first post. I did use 'fs.default.name' in
> > core-site.xml.
> >
> > BTW, the following is the list of error message when tasktracker was
> > started and shows that tasktracker failed to connect to namenode A:50001.
> >
> > /************************************************************
> > STARTUP_MSG: Starting TaskTracker
> > STARTUP_MSG: host = HOSTNAME/127.0.0.1
> > STARTUP_MSG: args = []
> > STARTUP_MSG: version = 0.20.1+169.56
> > STARTUP_MSG: build = -r 8e662cb065be1c4bc61c55e6bff161e09c1d36f3; compiled by 'root' on Tue Feb 9 13:40:08 EST 2010
> > ************************************************************/
> > 2010-03-09 00:08:50,199 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> > 2010-03-09 00:08:50,341 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> > 2010-03-09 00:08:50,350 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
> > 2010-03-09 00:08:50,350 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
> > 2010-03-09 00:08:50,350 INFO org.mortbay.log: jetty-6.1.14
> > 2010-03-09 00:08:50,707 INFO org.mortbay.log: Started [email protected]:50060
> > 2010-03-09 00:08:50,734 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
> > 2010-03-09 00:08:50,749 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=52550
> > 2010-03-09 00:08:50,799 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 52550: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 52550: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 52550: starting
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 52550: starting
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_HOSTNAME:HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,802 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 52550: starting
> > 2010-03-09 00:08:50,854 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@27b4c1d7
> > 2010-03-09 00:08:50,856 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_HOSTNAME:HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,858 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> > 2010-03-09 00:08:50,859 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> > 2010-03-09 00:09:11,970 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /A:50001. Already tried 0 time(s).
> > 2010-03-09 00:09:32,972 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /A:50001. Already tried 1 time(s).
> > ...
> >
> > Thanks,
> >
> > Michael
> >
> > --- On Mon, 3/8/10, Arun C Murthy <[email protected]> wrote:
> >
> > From: Arun C Murthy <[email protected]>
> > Subject: Re: where does jobtracker get the IP and port of namenode?
> > To: [email protected]
> > Date: Monday, March 8, 2010, 10:26 PM
> >
> > > Here's what is set in core-site.xml
> > >
> > > dfs.default.name=>hdfs://B:50001
> > >
> >
> > That should be 'fs.default.name' ...
> >
> > Arun
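
P.S. Since the logs above only show the connection being retried, it may help to test reachability outside Hadoop. Below is a minimal sketch, assuming Python 3 is available on the tasktracker box. The helper names `namenode_endpoint` and `can_connect` are hypothetical and not part of Hadoop. The sketch only splits an `fs.default.name`-style URI into host and port and tries to open a plain TCP connection to it; it does not reproduce how Hadoop itself resolves the namenode address.

```python
import socket
from urllib.parse import urlsplit


def namenode_endpoint(fs_default_name):
    """Split an fs.default.name-style value into (host, port).

    Hypothetical helper, not a Hadoop API. A value written without '//'
    (for example 'hdfs:A:50001') has no authority part, so both host and
    port come back as None.
    """
    parts = urlsplit(fs_default_name)
    return parts.hostname, parts.port


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout.

    This only checks network reachability at the TCP level; it says
    nothing about whether a namenode is actually answering RPC calls.
    """
    if host is None or port is None:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run from the tasktracker box, something like `can_connect(*namenode_endpoint("hdfs://A:50001"))`, with A replaced by the real address, would indicate whether that address and port are reachable at all. Trying it against both A and B may help narrow down which interface the failure involves.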
