Hi,

"*mapred.reduce.tasks*" - the number of reduce tasks your job requests. How
many of them can run at the same time depends on the
*mapred.tasktracker.reduce.tasks.maximum* property (how many reduce slots
each tasktracker has, which in turn determines the total number of reduce
slots in the cluster).
The recommendation is to set slightly fewer reducers than total slots because
this tolerates a few reduce failures without extending the job execution
time. If you allocate as many reducers as there are available slots, or more,
and a reduce task fails, the jobtracker has to wait before it can resubmit
the failed task to some other node, because all reduce slots may be utilized
at that time. In that case the job execution time is extended until the late
resubmitted task completes.
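
To make the arithmetic concrete, here is a rough sketch in Python. The cluster
size (10 tasktrackers with 2 reduce slots each) is hypothetical, and the 95%
headroom factor is only an illustrative assumption, not a value given in this
thread. The helper names are made up for the example and are not Hadoop APIs.

```python
def total_reduce_slots(num_tasktrackers, slots_per_tasktracker):
    # slots_per_tasktracker stands in for mapred.tasktracker.reduce.tasks.maximum
    return num_tasktrackers * slots_per_tasktracker


def suggested_reducers(total_slots, headroom_percent=95):
    # Keep a few slots free; the 95% default is an assumption for illustration.
    # Integer arithmetic avoids floating-point rounding surprises.
    return max(1, total_slots * headroom_percent // 100)


def reduce_waves(num_reducers, total_slots):
    # Number of rounds needed if every slot runs one reduce task at a time
    # (ceiling division).
    return (num_reducers + total_slots - 1) // total_slots


slots = total_reduce_slots(10, 2)        # hypothetical cluster: 20 slots
reducers = suggested_reducers(slots)     # 19 reducers
spare = slots - reducers                 # 1 slot left for a retried task
waves = reduce_waves(reducers, slots)    # still a single wave
```

With 19 reducers on 20 slots, a failed reduce task can be rescheduled into the
spare slot right away. With 21 reducers the same helper reports 2 waves, so
the last task has to wait for a slot to free up.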





On Mon, Apr 22, 2013 at 11:33 PM, Karthik Kambatla <[email protected]> wrote:

> I wonder how accurate that is.
>
> However, by setting the number of reducers slightly lesser than the reduce
> slots, the difference acts as headroom for speculative reduce tasks. And,
> the goal of a single wave is also preserved.
>
>
> On Mon, Apr 22, 2013 at 11:10 PM, Darpan R <[email protected]> wrote:
>
> > Hi guys,
> >  I read somewhere that for better performance
> >
> > For maximum performance, the number of reducers should be slightly less
> > than
> > the number of reduce slots in the cluster. This allows the reducers to
> > finish in
> > one wave and fully utilizes the cluster during the reduce phase.
> >
> > I don't quite understand this, Can you please help me understand?
> >
> > Thank you.
> >
>



-- 

Regards,
.....  Sudhakara.st
