RE: is a monolithic reduce task the right model?

2008-01-13 Thread Devaraj Das
disk usage).

-----Original Message-----
From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 10, 2008 11:16 PM
To: hadoop-user@lucene.apache.org
Subject: is a monolithic reduce task the right model?

in thinking about Aaron's use case and our own problems with fair

Re: is a monolithic reduce task the right model?

2008-01-10 Thread Doug Cutting
Joydeep Sen Sarma wrote:
> - what if current reduce tasks were broken into separate copy, sort and reduce tasks? we would get much smaller units of recovery and scheduling. thoughts?

If copy, sort and reduce are not scheduled together then it would be very hard to ensure they run on the same

Re: is a monolithic reduce task the right model?

2008-01-10 Thread Ted Dunning
Actually, all of my jobs tend to have one of these phases dominate the time. It isn't always the same phase that dominates, though, so the consideration isn't simple. The fact (if it is a fact) that one phase or another dominates means, however, that splitting them won't help much. On 1/10/08
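
A rough way to put numbers on the point that splitting helps little when one phase dominates. This is only a sketch under assumptions that are not from the thread: a single failure strikes at a uniformly random moment, a monolithic task loses everything done so far, and a split task loses only the work done in the current phase. The function name and the phase durations are hypothetical, and the model covers recovery cost only, not scheduling.

```python
def expected_rework_ratio(copy, sort, reduce):
    """Expected lost work with split tasks divided by that of a monolithic task.

    Assumption (not from the thread): exactly one failure, occurring at a
    uniformly random moment during the copy + sort + reduce run.
    """
    phases = [copy, sort, reduce]
    total = sum(phases)
    # Monolithic task: a failure at time t loses t, so the mean loss is total / 2.
    monolithic = total / 2
    # Split tasks: the failure lands in a phase of length p with probability
    # p / total, and on average loses half of that phase (p / 2).
    split = sum((p / total) * (p / 2) for p in phases)
    return split / monolithic


# Hypothetical phase durations in arbitrary time units.
print(round(expected_rework_ratio(1, 1, 1), 3))   # balanced phases
print(round(expected_rework_ratio(18, 1, 1), 3))  # one phase dominates
```

Under this model the ratio reduces to the sum of squared phase fractions, so balanced phases cut the expected rework to about a third, while a phase that takes 90% of the run leaves it above 80% of the monolithic cost.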