Sorry for my late response. It is the copy speed after all map tasks
have completed (1.8 MB/s); during the map phase the value is 1.5-1.6 MB/s.

2011/4/15 Harsh J <[email protected]>

> Hello Juwei,
>
> On Fri, Apr 15, 2011 at 10:43 PM, Juwei Shi <[email protected]> wrote:
> > Harsh,
> >
> > Do you know why reducers start one by one, several seconds apart?
> > They do not start at the same time. For example, suppose we set the
> > reduce task capacity (max concurrent reduce tasks) to 100, and the
> > average run time of a reduce task is 15 seconds. Although all map tasks
> > have completed, some reduce tasks have not yet been initiated by the
> > time earlier reduce tasks have already completed. The number of
> > concurrently running reduce tasks is then about 20 rather than 100.
> >
> > This may not be a problem because MapReduce is designed for high
> > throughput, not low latency. But if I need to optimize latency, do you
> > know how to control it? Either by tuning parameters or by changing some
> > source code, such as the heartbeat interval.
>
> Have a look at this thread:
> http://search-hadoop.com/m/bYupFnX7FY1/number+of+tasks+assign+per+heartbeat
>
> The one-by-one assignment is basically there to spread reduce tasks
> more evenly across TaskTracker (TT) hosts, which improves network usage.
> You can try writing your own scheduler and/or investigate alternative
> scheduler behaviors.
>
> --
> Harsh J
>
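
A rough way to read the reducer numbers in the quoted message is Little's
law: tasks in flight = launch rate x average run time. The sketch below is
only back-of-the-envelope arithmetic on the figures quoted above (capacity
100, 15 s run time, about 20 running). The function names are hypothetical,
and it assumes a steady launch rate across the whole cluster, which the
thread does not confirm.

```python
# Back-of-the-envelope estimate using Little's law (L = rate * time).
# Assumption: reduce tasks are launched at a roughly constant cluster-wide
# rate; the helper names below are illustrative, not Hadoop APIs.

def steady_state_concurrency(launch_rate_per_s, avg_runtime_s, capacity):
    """Tasks running at once, capped by the configured slot capacity."""
    return min(capacity, launch_rate_per_s * avg_runtime_s)


def required_launch_rate(target_concurrency, avg_runtime_s):
    """Launch rate (tasks/s) needed to keep target_concurrency tasks busy."""
    return target_concurrency / avg_runtime_s


avg_runtime_s = 15.0   # average reduce task run time from the message
capacity = 100         # configured max concurrent reduce tasks
observed = 20          # concurrency actually observed

# Launch rate implied by the observation, and the rate needed to fill
# every reduce slot.
implied_rate = required_launch_rate(observed, avg_runtime_s)
needed_rate = required_launch_rate(capacity, avg_runtime_s)

print(f"implied launch rate: {implied_rate:.2f} tasks/s")
print(f"rate needed to fill all slots: {needed_rate:.2f} tasks/s")
print(f"shortfall factor: {needed_rate / implied_rate:.1f}x")
```

Under these assumptions the cluster would have to launch reduce tasks
about five times faster to keep all 100 slots busy, which is why the
assignment rate (tasks handed out per heartbeat, or the heartbeat interval
itself) matters more here than the slot capacity.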
