W.P., upping mapred.reduce.tasks to a huge number just means the job will eventually spawn that many reducers. You still only have 360 reduce slots, so there is no real advantage, unless you are running into OOM errors, which we've seen with higher JVM re-use on a smaller number of reducers.
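To make that concrete, here is a minimal sketch of setting the reducer count at submit time (the jar and class names are hypothetical); the point is that this is only a hint about how many reduce tasks get created, not how many run at once:

```shell
# Hypothetical jar/class names for illustration. mapred.reduce.tasks
# controls how many reduce tasks the job creates; the cluster still runs
# at most its configured number of reduce slots (360 here) concurrently.
hadoop jar myjob.jar MyJob \
  -D mapred.reduce.tasks=360 \
  input/ output/
```

Setting it above the slot count just queues extra reduce tasks behind the 360 that can run at a time.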
Anyhoo, someone else can chime in and correct me if I am off base. Does that make sense?

Cheers,
James

On 2011-05-18, at 4:04 PM, W.P. McNeill wrote:

> I'm using fair scheduler and JVM reuse. It is just plain a big job.
>
> I'm not using a combiner right now, but that's something to look at.
>
> What about bumping the mapred.reduce.tasks up to some huge number? I think
> that shouldn't make a difference, but I'm hearing conflicting information on
> this.
