Hi JD,

The number of reduce tasks that actually receive work depends on the keys emitted once all the mappers are done. If every record has the same key, then all the data will go to a single reducer, and so to one node. Similarly, how well the reduce phase utilizes the nodes of the cluster depends on the number of distinct keys.
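To make that concrete, here is a rough sketch in Python (it is not Hadoop code). It assumes keys are assigned to reducers by hash modulo the reducer count, which is how Hadoop's default HashPartitioner behaves. The names `partition` and `reducer_load`, the sample keys, and the use of crc32 as the hash are all made up for illustration.

```python
import zlib
from collections import Counter


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer index for a key (hypothetical helper)."""
    # crc32 is only a stand-in for the key's hash code; it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_load(keys, num_reducers):
    """Count how many records land on each reducer index."""
    return Counter(partition(k, num_reducers) for k in keys)


same_key = ["user"] * 1000                      # every record shares one key
many_keys = [f"user{i}" for i in range(1000)]   # 1000 distinct keys

# Number of reducers (out of 8) that receive any records at all.
print(len(reducer_load(same_key, 8)))   # -> 1, a single reducer does all the work
print(len(reducer_load(many_keys, 8)))  # several reducers share the work
```

With a single key, seven of the eight reducers sit idle no matter how many reduce slots are configured.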
Regards,
Abhishek

On Fri, May 11, 2012 at 4:57 PM, Jeremy Davis <jda...@upstreamsoftware.com> wrote:

> I see mapred.tasktracker.reduce.tasks.maximum and
> mapred.tasktracker.map.tasks.maximum, but I'm wondering if there isn't
> another tuning parameter I need to look at.
>
> I can tune the task tracker so that when I have many jobs running, with
> many simultaneous maps and reduces I utilize 95% of cpu and memory.
>
> Inevitably though I end up with a huge final reduce task that only uses
> half of my cluster because I have reserved the other half for Mapping.
>
> Is there a way around this problem?
>
> Seems like there should also be a maximum number of reducers conditional
> on no Map tasks running.
>
> -JD