Hi Sonal,

Thanks. I guess you are right; ps -ef exposes such processes.

-bikash
On Tue, Mar 1, 2011 at 1:29 PM, Sonal Goyal <[email protected]> wrote:
> Bikash,
>
> I have sometimes found hanging processes which jps does not report, but a
> ps -ef shows them. Maybe you can check this on the errant nodes.
>
> Thanks and Regards,
> Sonal
> Hadoop ETL and Data Integration <https://github.com/sonalgoyal/hiho>
> Nube Technologies <http://www.nubetech.co>
> <http://in.linkedin.com/in/sonalgoyal>
>
> On Tue, Mar 1, 2011 at 7:37 PM, bikash sharma <[email protected]> wrote:
>> Hi James,
>> Sorry for the late response. No, the same problem persists. I reformatted
>> HDFS, stopped the mapred and hdfs daemons, and restarted them (using
>> start-dfs.sh and start-mapred.sh from the master node). But surprisingly,
>> out of the 4-node cluster, two nodes have a TaskTracker running while the
>> other two do not (verified using jps). I guess that since I have Hadoop
>> installed on shared storage, that might be the issue? Btw, how do I start
>> the services independently on each node?
>>
>> -bikash
>>
>> On Sun, Feb 27, 2011 at 11:05 PM, James Seigel <[email protected]> wrote:
>>> .... Did you get it working? What was the fix?
>>>
>>> Sent from my mobile. Please excuse the typos.
>>>
>>> On 2011-02-27, at 8:43 PM, Simon <[email protected]> wrote:
>>>> Hey Bikash,
>>>>
>>>> Maybe you can manually start a tasktracker on the node and see if there
>>>> are any error messages. Also, don't forget to check your configuration
>>>> files for mapreduce and hdfs, and make sure the datanode can start
>>>> successfully first. After all these steps, you can submit a job on the
>>>> master node and see if there is any communication between these failed
>>>> nodes and the master node. Post your error messages here if possible.
>>>>
>>>> HTH.
>>>> Simon
>>>>
>>>> On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma <[email protected]> wrote:
>>>>> Thanks James. Well, all the config files and shared keys are on shared
>>>>> storage that is accessed by all the nodes in the cluster.
>>>>> At times everything runs fine on initialization, but at other times
>>>>> the same problem persists, so I was a bit confused.
>>>>> Also, I checked the TaskTracker logs on those nodes; there does not
>>>>> seem to be any error.
>>>>>
>>>>> -bikash
>>>>>
>>>>> On Sat, Feb 26, 2011 at 10:30 AM, James Seigel <[email protected]> wrote:
>>>>>> Maybe your ssh keys aren't distributed the same on each machine, or
>>>>>> the machines aren't configured the same?
>>>>>>
>>>>>> J
>>>>>>
>>>>>> On 2011-02-26, at 8:25 AM, bikash sharma wrote:
>>>>>>> Hi,
>>>>>>> I have a 10-node Hadoop cluster, where I am running some benchmarks
>>>>>>> for experiments.
>>>>>>> Surprisingly, when I initialize the Hadoop cluster
>>>>>>> (hadoop/bin/start-mapred.sh), in many instances only some nodes have
>>>>>>> the TaskTracker process up (seen using jps), while other nodes do
>>>>>>> not. Could anyone please explain?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Bikash
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Simon
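[Editor's note] Sonal's jps-vs-ps cross-check can be sketched as a pair of shell commands run on a suspect node. jps only lists JVMs it can instrument (it can miss hung processes or JVMs started by another user), while ps -ef sees every process; a minimal sketch, assuming a stock Hadoop install where the daemon runs as org.apache.hadoop.mapred.TaskTracker:

```shell
# jps view: only instrumented JVMs; a hung TaskTracker may be absent here.
jps | grep -i tasktracker

# ps view: every process on the box. The [t] bracket trick stops the grep
# command itself from matching its own pattern in the process list.
ps -ef | grep -i '[t]asktracker'
```

If ps reports a TaskTracker that jps does not, that stale process should be killed before restarting the daemon, or the restart may fail to bind its ports.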

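[Editor's note] On Bikash's question about starting the services independently on each node: in Hadoop 0.20/1.x, start-dfs.sh and start-mapred.sh are wrappers that loop over conf/slaves and invoke bin/hadoop-daemon.sh on each node over ssh, so the same script can be run directly on a single node to bring one daemon up or down. A hedged sketch; the host names and the hadoop/ install path are illustrative, not taken from the thread:

```shell
# On the affected node itself:
#   hadoop/bin/hadoop-daemon.sh start tasktracker
#   hadoop/bin/hadoop-daemon.sh stop tasktracker
#
# Or drive it from the master: dry run that prints the per-node commands
# which would be issued for four illustrative slaves.
for node in node01 node02 node03 node04; do
  echo ssh "$node" hadoop/bin/hadoop-daemon.sh start tasktracker
done
```

hadoop-daemon.sh also prints the path of the log file it writes to, which is the first place to look when a TaskTracker silently fails to appear in jps.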