Hi Michael,

Please check:
http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html#Logging

Then check your master and slave logs. As far as I can tell, the logs in
your emails so far show that the connection is failing, but not what is
causing it to fail.
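
In the meantime, it may help to confirm from the tasktracker box whether the
address configured in "fs.default.name" is reachable at all. Below is a
minimal Python sketch of such a check. The helper names namenode_endpoint and
can_connect are made up for illustration and are not part of Hadoop; the
sketch only parses the URI and attempts a plain TCP connection, so it says
nothing about how Hadoop itself resolves the address.

```python
import socket
from urllib.parse import urlsplit


def namenode_endpoint(fs_default_name):
    """Split an fs.default.name value such as hdfs://host:port into (host, port).

    Hypothetical helper for this check only; not a Hadoop API.
    """
    parts = urlsplit(fs_default_name)
    # Without the "//" after the scheme there is no authority component,
    # so hostname and port both come back as None and the value is rejected.
    if parts.scheme != "hdfs" or parts.hostname is None or parts.port is None:
        raise ValueError("expected hdfs://host:port, got %r" % fs_default_name)
    return parts.hostname, parts.port


def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

Run on the tasktracker box, can_connect(*namenode_endpoint(value)), where
value is the "fs.default.name" setting from core-site.xml, should return True
if the namenode port is reachable. A False result would suggest a problem
at the network level (routing, a firewall, or the namenode listening on a
different interface) rather than in the Hadoop configuration itself.
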
Thanks and Regards,
Sonal


On Tue, Mar 9, 2010 at 3:53 PM, jiang licht <[email protected]> wrote:

> Thanks Sonal. How do I set that debug mode? I have already set
> "dfs.namenode.logging.level" to "all". Please see my first and previous
> posts for the error messages.
>
> Thanks,
>
> Michael
>
> --- On Tue, 3/9/10, Sonal Goyal <[email protected]> wrote:
>
> From: Sonal Goyal <[email protected]>
> Subject: Re: where does jobtracker get the IP and port of namenode?
> To: [email protected]
> Date: Tuesday, March 9, 2010, 4:01 AM
>
> Can you turn the logging level up to debug to see what the logs say?
>
> Thanks and Regards,
> Sonal
>
>
> On Tue, Mar 9, 2010 at 1:08 PM, jiang licht <[email protected]> wrote:
>
> > I guess my confusion is this:
> >
> > I point "fs.default.name" to hdfs://A:50001 in core-site.xml (A is an IP
> > address). I assume that when the tasktracker starts, it should use
> > A:50001 to contact the namenode. But the tasktracker log actually shows
> > that it uses B, which is the IP address of another network interface on
> > the namenode box. Because the tasktracker box cannot reach address B, the
> > tasktracker simply retries the connection and finally fails to start. I
> > read some source code in
> > org.apache.hadoop.hdfs.DistributedFileSystem.initialize, and it seems to
> > me that the namenode address is passed in earlier from what is specified
> > in "fs.default.name". Is it correct that the namenode address used here
> > by the tasktracker comes from "fs.default.name" in core-site.xml, or is
> > there another step in which this value is changed? Could someone explain
> > how the tasktracker resolves the namenode and contacts it? Thanks!
> >
> > Thanks,
> >
> > Michael
> >
> > --- On Tue, 3/9/10, jiang licht <[email protected]> wrote:
> >
> > From: jiang licht <[email protected]>
> > Subject: Re: where does jobtracker get the IP and port of namenode?
> > To: [email protected]
> > Date: Tuesday, March 9, 2010, 12:20 AM
> >
> > Sorry, that was a typo in my first post. I did use 'fs.default.name' in
> > core-site.xml.
> >
> > BTW, the following are the log messages from when the tasktracker was
> > started. They show that the tasktracker failed to connect to the
> > namenode at A:50001.
> >
> > /************************************************************
> > STARTUP_MSG: Starting TaskTracker
> > STARTUP_MSG:   host = HOSTNAME/127.0.0.1
> > STARTUP_MSG:   args = []
> > STARTUP_MSG:   version = 0.20.1+169.56
> > STARTUP_MSG:   build =  -r 8e662cb065be1c4bc61c55e6bff161e09c1d36f3;
> > compiled by 'root' on Tue Feb  9 13:40:08 EST 2010
> > ************************************************************/
> > 2010-03-09 00:08:50,199 INFO org.mortbay.log: Logging to
> > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> > org.mortbay.log.Slf4jLog
> > 2010-03-09 00:08:50,341 INFO org.apache.hadoop.http.HttpServer: Port
> > returned by webServer.getConnectors()[0].getLocalPort() before open() is
> > -1. Opening the listener on 50060
> > 2010-03-09 00:08:50,350 INFO org.apache.hadoop.http.HttpServer:
> > listener.getLocalPort() returned 50060
> > webServer.getConnectors()[0].getLocalPort() returned 50060
> > 2010-03-09 00:08:50,350 INFO org.apache.hadoop.http.HttpServer: Jetty
> > bound to port 50060
> > 2010-03-09 00:08:50,350 INFO org.mortbay.log: jetty-6.1.14
> > 2010-03-09 00:08:50,707 INFO org.mortbay.log: Started
> > [email protected]:50060
> > 2010-03-09 00:08:50,734 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=TaskTracker, sessionId=
> > 2010-03-09 00:08:50,749 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> > Initializing RPC Metrics with hostName=TaskTracker, port=52550
> > 2010-03-09 00:08:50,799 INFO org.apache.hadoop.ipc.Server: IPC Server
> > Responder: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server
> > listener on 52550: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 0 on 52550: starting
> > 2010-03-09 00:08:50,800 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 52550: starting
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 2 on 52550: starting
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.mapred.TaskTracker:
> > TaskTracker up at: HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,801 INFO org.apache.hadoop.mapred.TaskTracker:
> > Starting tracker tracker_HOSTNAME:HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,802 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 3 on 52550: starting
> > 2010-03-09 00:08:50,854 INFO org.apache.hadoop.mapred.TaskTracker:  Using
> > MemoryCalculatorPlugin :
> > org.apache.hadoop.util.linuxmemorycalculatorplu...@27b4c1d7
> > 2010-03-09 00:08:50,856 INFO org.apache.hadoop.mapred.TaskTracker:
> > Starting thread: Map-events fetcher for all reduce tasks on
> > tracker_HOSTNAME:HOSTNAME/127.0.0.1:52550
> > 2010-03-09 00:08:50,858 WARN org.apache.hadoop.mapred.TaskTracker:
> > TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
> > disabled.
> > 2010-03-09 00:08:50,859 INFO org.apache.hadoop.mapred.IndexCache:
> > IndexCache created with max memory = 10485760
> > 2010-03-09 00:09:11,970 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /A:50001. Already tried 0 time(s).
> > 2010-03-09 00:09:32,972 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /A:50001. Already tried 1 time(s).
> > ...
> >
> > Thanks,
> >
> > Michael
> >
> > --- On Mon, 3/8/10, Arun C Murthy <[email protected]> wrote:
> >
> > From: Arun C Murthy <[email protected]>
> > Subject: Re: where does jobtracker get the IP and port of namenode?
> > To: [email protected]
> > Date: Monday, March 8, 2010, 10:26 PM
> >
> > > Here's what is set in core-site.xml
> > >
> > > dfs.default.name=>hdfs://B:50001
> > >
> >
> > That should be 'fs.default.name' ...
> >
> > Arun
> >
