If you increase the rate at which the TTs heartbeat to the JobTracker, they may pick up work more often.
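A rough way to see the effect, as a sketch only: assume a freed slot gets new work only at the TT's next periodic heartbeat, and that heartbeats fire at fixed multiples of the interval. The function name, the finish times and the intervals below are all made up for illustration; none of this is Hadoop code or a Hadoop default.

```python
import math


def idle_slot_seconds(finish_times, heartbeat_interval):
    """Sum the seconds each freed slot waits for the next heartbeat tick.

    Simplifying assumption: work is handed out only on periodic heartbeats,
    which fire at 0, interval, 2*interval, ... seconds.
    """
    idle = 0
    for finish in finish_times:
        # First heartbeat tick at or after the moment the task finished.
        next_beat = math.ceil(finish / heartbeat_interval) * heartbeat_interval
        idle += next_beat - finish
    return idle


if __name__ == "__main__":
    # Hypothetical finish times (seconds) for the 8 map slots on one node.
    finishes = [10, 11, 12, 13, 14, 15, 16, 17]
    for interval in (3, 10, 30):
        print(interval, idle_slot_seconds(finishes, interval))
```

Under these assumptions a shorter interval means each free slot waits less before it is refilled, at the cost of more heartbeat traffic into the JT.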
The JT only hands out work when either of
 - the TT reports a task completion
 - the TT heartbeats in

This is a design that scales well for large clusters, but can add startup latency for small ones

steve

On 30 August 2012 02:20, Terry Healy <[email protected]> wrote:

> Thanks guys. Unfortunately I had started the datanode by local command
> rather than from start-all.sh, so the related parts of the logs were
> lost. I was watching the cpu loads on all 8 cores via gkrellm at the
> time and they were definitely quiet. After a few minutes the jobs seemed
> to get in sync and it ran under a reasonable load (i.e. all cores mostly
> busy, with only brief gaps between tasks) for the rest of the job.
>
> I will attempt to re-create tomorrow with proper logging. I will look
> into enabling Hadoop metrics.
>
> -Terry
>
> On 8/29/12 8:14 PM, Vinod Kumar Vavilapalli wrote:
> > Do you know if you have enough job-load on the system? One way to look
> > at this is to look for running map/reduce tasks on the JT UI at the same
> > time you are looking at the node's cpu usage.
> >
> > Collecting hadoop metrics via a metrics collection system say ganglia
> > will let you match up the timestamps of idleness on the nodes with the
> > job-load at that point of time.
> >
> > HTH,
> > +vinod
> >
> > On Aug 29, 2012, at 6:40 AM, Terry Healy wrote:
> >
> >> Running 1.0.2, in this case on Linux.
> >>
> >> I was watching the processes / loads on one TaskTracker instance and
> >> noticed that it completed it's first 8 map tasks and reported 8 free
> >> slots (the max for this system). It then waited doing nothing for more
> >> than 30 seconds before the next "batch" of work came in and started
> >> running.
> >>
> >> Likewise it also has relatively long periods with all 8 cores running at
> >> or near idle. There are no jobs failing or obvious errors in the
> >> TaskTracker log.
> >>
> >> What could be causing this?
> >>
> >> Should I increase the number of map jobs to greater than number of cores
> >> to try and keep it busier?
> >>
> >> -Terry
>
> --
> Terry Healy / [email protected]
> Cyber Security Operations
> Brookhaven National Laboratory
> Building 515, Upton N.Y. 11973
