reduce slots )

Abhishek Pratap Singh Mon, 14 May 2012 12:04:10 -0700

Hi JD,

The number of reduce tasks that actually receive data depends on the keys
emitted once all the mappers are done. If every key is the same, all the
data goes to one reducer on one node. Likewise, how well the nodes of the
cluster are utilized depends on the number of distinct keys that reach the
reduce phase.
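
As a rough illustration of the point about keys, here is a minimal sketch
of how records spread across reducers under hash partitioning. It assumes
the job uses Hadoop's default HashPartitioner. The function names
(partition, reducer_load) and the sample keys are hypothetical, and crc32
is only a deterministic stand-in for Java's hashCode().

```python
import zlib
from collections import Counter


def partition(key: str, num_reducers: int) -> int:
    """Map a key to a reducer index.

    Stand-in for Hadoop's default HashPartitioner, which computes
    (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
    crc32 is an assumption used here only because it is deterministic.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_load(keys, num_reducers):
    """Count how many records land on each reducer index."""
    load = Counter(partition(k, num_reducers) for k in keys)
    return [load.get(i, 0) for i in range(num_reducers)]


# Hypothetical data: 1000 records sharing one key vs. 1000 distinct keys.
same_key = reducer_load(["user"] * 1000, 8)
many_keys = reducer_load([f"user{i}" for i in range(1000)], 8)

print("one key:      ", same_key)
print("distinct keys:", many_keys)
```

With a single key, one reducer gets every record and the other seven sit
idle, no matter how many reduce slots the cluster has.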



Regards,
Abhishek

On Fri, May 11, 2012 at 4:57 PM, Jeremy Davis
<jda...@upstreamsoftware.com> wrote:

>
> I see mapred.tasktracker.reduce.tasks.maximum and
> mapred.tasktracker.map.tasks.maximum, but I'm wondering if there isn't
> another tuning parameter I need to look at.
>
> I can tune the task tracker so that when I have many jobs running, with
> many simultaneous maps and reduces I utilize 95% of cpu and memory.
>
> Inevitably, though, I end up with a huge final reduce task that only uses
> half of my cluster because I have reserved the other half for mapping.
>
> Is there a way around this problem?
>
> Seems like there should also be a maximum number of reducers conditional
> on no Map tasks running.
>
> -JD
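
The arithmetic behind the "half of my cluster" observation can be sketched
as follows. This is a simplified model, not a description of the
scheduler: it assumes every slot represents an equal share of a node's
resources and that map slots cannot be lent to reduce tasks. The function
name is hypothetical; the two slot counts correspond to
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum.

```python
def reduce_only_utilization(map_slots: int, reduce_slots: int) -> float:
    """Fraction of a node's slots in use when only reduce tasks remain.

    Assumption: each slot is an equal share of the node, and idle map
    slots cannot be reused by reducers.
    """
    total = map_slots + reduce_slots
    if total <= 0:
        raise ValueError("a node needs at least one slot")
    return reduce_slots / total


# Hypothetical node with 4 map slots and 4 reduce slots.
print(reduce_only_utilization(4, 4))
# Hypothetical node tuned for map-heavy work: 6 map slots, 2 reduce slots.
print(reduce_only_utilization(6, 2))
```

Under this model an even split between map and reduce slots caps a
reduce-only phase at half of each node, which matches the behaviour
described above.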

Re: Resource underutilization / final reduce tasks only uses half of cluster ( tasktracker map/reduce slots )
