I'd love to give you tips, but you didn't provide any data about the input and output of your job, the kind of hardware you're using, etc. At this point any suggestion would be a stab in the dark; the best I can do is point you to the existing documentation: http://wiki.apache.org/hadoop/PerformanceTuning

J-D

On Tue, Oct 5, 2010 at 7:12 PM, Venkatesh <[email protected]> wrote:
>
> I've a mapreduce job that is taking too long..over an hour..Trying to see
> what I can tune to bring it down..One thing I noticed, the job is kicking off
> - 500+ map tasks: 490 of them do not process any records..whereas 10 of
> them process all the records (200 K each..)..Any idea why that would be?...
>
> ..map phase takes about a couple of minutes..
> ..reduce phase takes the rest..
>
> ..i'll try increasing # of reduce tasks..Open to other suggestions for
> tunables..
>
> thanks for your input
> venkatesh
