On Wed, May 9, 2012 at 10:52 PM, Raj Vishwanathan <rajv...@yahoo.com> wrote:
> The picture is either too small or too pixelated for my eyes :-)

There should be a zoom option in the top right of the page that allows you
to view it full size.

> Can you log in to the box and send the output of top? If the system is
> unresponsive, it has to be something more than an unbalanced hdfs cluster,
> methinks.

Sorry, I'm unable to log in to the box, it's completely unresponsive.

> Raj
>
> >________________________________
> >From: Darrell Taylor <darrell.tay...@gmail.com>
> >To: common-user@hadoop.apache.org; Raj Vishwanathan <rajv...@yahoo.com>
> >Sent: Wednesday, May 9, 2012 2:40 PM
> >Subject: Re: High load on datanode startup
> >
> >On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan <rajv...@yahoo.com> wrote:
> >
> >> When you say 'load', what do you mean? CPU load or something else?
> >
> >I mean in the unix sense of load average, i.e. top would show a load of
> >(currently) 376.
> >
> >Looking at Ganglia stats for the box it's not CPU load as such, the graphs
> >show actual CPU usage as 30%, but the number of running processes is
> >simply growing in a linear manner - screenshot of the Ganglia page here:
> >
> >https://picasaweb.google.com/lh/photo/Q0uFSzyLiriDuDnvyRUikXVR0iWwMibMfH0upnTwi28?feat=directlink
> >
> >> Raj
> >>
> >> >________________________________
> >> >From: Darrell Taylor <darrell.tay...@gmail.com>
> >> >To: common-user@hadoop.apache.org
> >> >Sent: Wednesday, May 9, 2012 9:52 AM
> >> >Subject: High load on datanode startup
> >> >
> >> >Hi,
> >> >
> >> >I wonder if someone could give some pointers with a problem I'm having?
> >> >
> >> >I have a 7-machine cluster set up for testing and we have been pouring
> >> >data into it for a week without issue, have learnt several things along
> >> >the way and solved all the problems up to now by searching online, but
> >> >now I'm stuck. One of the data nodes decided to have a load of 70+ this
> >> >morning, stopping datanode and tasktracker brought it back to normal,
> >> >but every time I start the datanode again the load shoots through the
> >> >roof, and all I get in the logs is:
> >> >
> >> >STARTUP_MSG: Starting DataNode
> >> >STARTUP_MSG: host = pl464/10.20.16.64
> >> >STARTUP_MSG: args = []
> >> >STARTUP_MSG: version = 0.20.2-cdh3u3
> >> >STARTUP_MSG: build = file:///data/1/tmp/nightly_2012-03-20_13-13-48_3/hadoop-0.20-0.20.2+923.197-1~squeeze
> >> >-************************************************************/
> >> >
> >> >2012-05-09 16:12:05,925 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
> >> >
> >> >2012-05-09 16:12:06,139 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
> >> >
> >> >Nothing else.
> >> >
> >> >The load seems to max out only 1 of the CPUs, but the machine becomes
> >> >*very* unresponsive.
> >> >
> >> >Anybody got any pointers of things I can try?
> >> >
> >> >Thanks
> >> >Darrell.
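
Since top cannot be run interactively once the box stops responding, one option might be to leave a small script logging load figures to a file before the datanode is started. The following is only a sketch under assumptions: it assumes a Linux /proc filesystem, and the names `parse_loadavg`, `parse_state` and `snapshot` are hypothetical helpers made up for illustration, not part of Hadoop or of any tool mentioned in this thread. It reads the load average and tallies process states (processes only, not individual threads).

```python
import os
from collections import Counter
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float
    five_min: float
    fifteen_min: float
    runnable: int   # currently runnable scheduling entities
    total: int      # total scheduling entities on the system


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the contents of /proc/loadavg, e.g. '0.20 0.18 0.12 1/80 11206'."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(runnable), int(total))


def parse_state(stat_text: str) -> str:
    """Return the one-letter process state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' and take the next field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def snapshot(proc_root: str = "/proc"):
    """Return (LoadAvg, Counter of process states) for the local machine."""
    with open(os.path.join(proc_root, "loadavg")) as f:
        load = parse_loadavg(f.read())
    states = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                states[parse_state(f.read())] += 1
        except OSError:
            continue  # the process exited while we were scanning
    return load, states


if __name__ == "__main__":
    # Only attempt a live reading where /proc/loadavg actually exists.
    if os.path.exists("/proc/loadavg"):
        load, states = snapshot()
        print(load)
        print(dict(states))
```

On Linux, tasks in state D (uninterruptible sleep, typically waiting on disk I/O) count toward the load average along with runnable tasks, so a large D count alongside modest CPU usage would suggest the load is coming from I/O waits rather than computation.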