Sorry for my late response. It is the copy speed after all map tasks have completed (1.8 MB/s); during the map phase this value is 1.5-1.6 MB/s.

2011/4/15 Harsh J <[email protected]>

> Hello Juwei,
>
> On Fri, Apr 15, 2011 at 10:43 PM, Juwei Shi <[email protected]> wrote:
> > Harsh,
> >
> > Do you know why reducers start one by one at intervals of several
> > seconds? They do not start at the same time. For example, suppose we
> > set the reduce task capacity (max concurrent reduce tasks) to 100 and
> > the average run time of a reduce task is 15 seconds. Although all map
> > tasks are completed, some reduce tasks have not yet been initiated when
> > the prior reduce tasks have already completed. Then the number of
> > concurrently running reduce tasks will be about 20 rather than 100.
> >
> > This may not be a problem because MapReduce is designed for high
> > throughput, not low latency. But if I have some requirement to optimize
> > the latency, do you know how to control it? Either by tuning parameters
> > or by changing some source code such as the heartbeat interval.
>
> Have a look at this thread:
> http://search-hadoop.com/m/bYupFnX7FY1/number+of+tasks+assign+per+heartbeat
>
> It is basically to gain better spreading of reduce tasks across TT
> hosts (improves network usage). You can try writing your own scheduler
> and/or investigate alternative scheduler behaviors.
>
> --
> Harsh J
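
As a rough illustration of the numbers in the quoted question (a back-of-the-envelope sketch, not how the JobTracker actually schedules): if reduce tasks are handed out at a roughly constant rate and each one runs for about 15 seconds, Little's law says the steady-state number of concurrently running reducers is about the assignment rate times the run time, capped by the slot capacity. The function names below are hypothetical, and the constant assignment rate is an assumption made only for this estimate.

```python
def implied_assign_rate(concurrent_tasks: float, avg_runtime_s: float) -> float:
    """Assignment rate (tasks/second) implied by an observed concurrency level."""
    return concurrent_tasks / avg_runtime_s


def steady_state_concurrency(assign_rate_per_s: float,
                             avg_runtime_s: float,
                             capacity: int) -> float:
    """Little's law estimate of running tasks, capped by the slot capacity."""
    return min(capacity, assign_rate_per_s * avg_runtime_s)


# Numbers taken from the quoted question: ~20 running reducers, 15 s each,
# capacity of 100 reduce slots.
observed_rate = implied_assign_rate(20, 15)
needed_rate = implied_assign_rate(100, 15)

print(f"implied assignment rate: {observed_rate:.2f} tasks/s")
print(f"rate needed to fill 100 slots: {needed_rate:.2f} tasks/s")
print(f"estimated concurrency at observed rate: "
      f"{steady_state_concurrency(observed_rate, 15, 100):.1f}")
```

Under these assumptions, keeping all 100 slots busy with 15-second tasks would need roughly five times the observed assignment rate, which is why the number of tasks assigned per heartbeat matters more for short reducers than for long ones.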
