You can check the logs on the nodes whose TaskTracker isn't up.
The path is "HADOOP_HOME/logs/".
The answer may be in there.
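For what it's worth, a sketch of the commands I'd run on one of the failing nodes (the exact log file names vary with the user and hostname, so the globs below are assumptions):

```shell
# 1. Look at the tail of the TaskTracker log on the failing node.
tail -n 50 $HADOOP_HOME/logs/hadoop-*-tasktracker-*.log

# 2. Search the log for error or fatal messages.
grep -E 'ERROR|FATAL' $HADOOP_HOME/logs/hadoop-*-tasktracker-*.log

# 3. Start the daemon by hand on that node and watch whether it comes up.
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
```

Running hadoop-daemon.sh locally on each node is also the answer to starting the services independently, instead of going through start-mapred.sh on the master.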

2011/3/2 bikash sharma <[email protected]>

> Hi Sonal,
> Thanks. I guess you are right. ps -ef exposes such processes.
>
> -bikash
>
> On Tue, Mar 1, 2011 at 1:29 PM, Sonal Goyal <[email protected]> wrote:
>
> > Bikash,
> >
> > I have sometimes found hanging processes which jps does not report, but a
> > ps -ef shows them. Maybe you can check this on the errant nodes..
> >
> > Thanks and Regards,
> > Sonal
> > Hadoop ETL and Data Integration <https://github.com/sonalgoyal/hiho>
> > Nube Technologies <http://www.nubetech.co>
> >
> > <http://in.linkedin.com/in/sonalgoyal>
> >
> > On Tue, Mar 1, 2011 at 7:37 PM, bikash sharma <[email protected]> wrote:
> >
> >> Hi James,
> >> Sorry for the late response. No, the same problem persists. I reformatted
> >> HDFS, stopped the mapred and hdfs daemons, and restarted them (using
> >> start-dfs.sh and start-mapred.sh from the master node). But surprisingly,
> >> out of the 4-node cluster, two nodes have TaskTracker running while the
> >> other two do not (verified using jps). I guess the issue might be that I
> >> have Hadoop installed on shared storage? Btw, how do I start the services
> >> independently on each node?
> >>
> >> -bikash
> >> On Sun, Feb 27, 2011 at 11:05 PM, James Seigel <[email protected]> wrote:
> >>
> >> > .... Did you get it working?  What was the fix?
> >> >
> >> > Sent from my mobile. Please excuse the typos.
> >> >
> >> > On 2011-02-27, at 8:43 PM, Simon <[email protected]> wrote:
> >> >
> >> > > Hey Bikash,
> >> > >
> >> > > Maybe you can manually start a TaskTracker on the node and see if
> >> > > there are any error messages. Also, don't forget to check your
> >> > > configuration files for mapreduce and hdfs, and make sure the datanode
> >> > > can start successfully first. After all these steps, you can submit a
> >> > > job on the master node and see if there is any communication between
> >> > > these failed nodes and the master node. Post your error messages here
> >> > > if possible.
> >> > >
> >> > > HTH.
> >> > > Simon -
> >> > >
> >> > > On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma <[email protected]> wrote:
> >> > >
> >> > >> Thanks James. Well, all the config files and shared keys are on
> >> > >> shared storage that is accessed by all the nodes in the cluster.
> >> > >> At times everything runs fine on initialization, but at other times
> >> > >> the same problem persists, so I was a bit confused.
> >> > >> Also, I checked the TaskTracker logs on those nodes; there does not
> >> > >> seem to be any error.
> >> > >>
> >> > >> -bikash
> >> > >>
> >> > >> On Sat, Feb 26, 2011 at 10:30 AM, James Seigel <[email protected]> wrote:
> >> > >>
> >> > >>> Maybe your ssh keys aren’t distributed the same on each machine, or
> >> > >>> the machines aren’t configured the same?
> >> > >>>
> >> > >>> J
> >> > >>>
> >> > >>>
> >> > >>> On 2011-02-26, at 8:25 AM, bikash sharma wrote:
> >> > >>>
> >> > >>>> Hi,
> >> > >>>> I have a 10-node Hadoop cluster, where I am running some
> >> > >>>> benchmarks for experiments.
> >> > >>>> Surprisingly, when I initialize the Hadoop cluster
> >> > >>>> (hadoop/bin/start-mapred.sh), in many instances only some nodes
> >> > >>>> have the TaskTracker process up (seen using jps), while other
> >> > >>>> nodes do not. Could anyone please explain?
> >> > >>>>
> >> > >>>> Thanks,
> >> > >>>> Bikash
> >> > >>>
> >> > >>>
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Regards,
> >> > > Simon
> >> >
> >>
> >
> >
>
