Hi Nick,

How many hard drives do your slaves have? What RPM are they? How many mappers run concurrently on a node? Did you turn off speculative execution? Have a look at disk I/O to see whether that is the bottleneck.
MR is disk-I/O bound, so if you only have one disk per slave and you are running 5 mappers concurrently, the job will slow down.

Thanks,
Anil

On Wed, Oct 24, 2012 at 9:18 AM, Kevin O'dell <[email protected]> wrote:
> Nick,
>
> What versions are you using:
>
> HDFS
> HBase
> OS
>
> On Oct 24, 2012 10:36 AM, "Nick maillard" <[email protected]> wrote:
> > Hello everyone
> >
> > Still looking into the issue.
> > I have tried different tests and the results are surprising.
> > If I set mapred.tasktracker.map.tasks.maximum: 28
> > I get a total of 84 tasks on my cluster and the process takes about
> > 1h15, each task taking up to 1h10. The whole file is cut into 80 tasks.
> >
> > If I set mapred.tasktracker.map.tasks.maximum: 3
> > I get a total of 6 tasks on my cluster and the process takes about the
> > same amount of time, 1h20, still cutting the whole file into 80 tasks,
> > but now of course each individual task only takes a couple of minutes.
> >
> > It's like the overall ImportTsv must take 1h-something and the duration
> > of the map tasks varies accordingly.
> >
> > There is definitely something I am doing wrong.

--
Thanks & Regards,
Anil Gupta
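For reference, the two knobs discussed in this thread (map slots per TaskTracker and speculative execution) are set in mapred-site.xml on each slave. A minimal sketch using the pre-YARN MR1 property names mentioned above; the values are illustrative for a single-disk node, not recommendations:

```xml
<!-- mapred-site.xml (MR1); restart the TaskTracker after changes -->
<configuration>
  <!-- Cap concurrent map slots on this node; with one disk,
       a small value reduces disk contention between mappers -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>3</value>
  </property>
  <!-- Turn off speculative execution for map tasks -->
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
</configuration>
```

Note that this caps slots per node, not the total number of map tasks: the input split count (80 here) is determined by the file size and block size, so lowering the slot count only changes how many splits run at once.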
