disk usage).
-----Original Message-----
From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 10, 2008 11:16 PM
To: hadoop-user@lucene.apache.org
Subject: is a monolithic reduce task the right model?
in thinking about Aaron's use case and our own problems with
fair
Joydeep Sen Sarma wrote:
- what if current reduce tasks were broken into separate copy, sort and reduce
tasks?
we would get much smaller units of recovery and scheduling.
thoughts?
If copy, sort and reduce are not scheduled together then it would be
very hard to ensure they run on the same
Actually, all of my jobs tend to have one of these phases dominate the time.
It isn't always the same phase that dominates, though, so the consideration
isn't simple.
The fact (if it is a fact) that one phase or another dominates means,
however, that splitting them won't help much.
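A rough way to check that intuition is a back-of-the-envelope model of recovery cost. The sketch below is only an illustration, not anything from Hadoop: the phase durations are made-up numbers, the function names are hypothetical, and it assumes exactly one failure per task, that the failure lands in a phase with probability proportional to that phase's duration, and that it strikes at the end of the phase. It also looks only at recovery cost and ignores any scheduling benefit of smaller tasks.

```python
PHASES = ("copy", "sort", "reduce")


def expected_rerun_minutes(phase_minutes, split):
    """Expected minutes of work redone after a single failure.

    Assumptions (illustrative only): the failure hits a phase with
    probability proportional to its duration, and at the end of it.
    A monolithic task redoes everything up to and including the failed
    phase; a split task redoes only the failed phase.
    """
    total = sum(phase_minutes[p] for p in PHASES)
    expected = 0.0
    elapsed = 0.0
    for phase in PHASES:
        duration = phase_minutes[phase]
        elapsed += duration
        p_fail_here = duration / total
        redo = duration if split else elapsed
        expected += p_fail_here * redo
    return expected


def relative_saving(phase_minutes):
    """Fraction of expected rerun work avoided by splitting the task."""
    mono = expected_rerun_minutes(phase_minutes, split=False)
    separate = expected_rerun_minutes(phase_minutes, split=True)
    return (mono - separate) / mono


# Hypothetical durations in minutes, not measurements from any real job.
dominated = {"copy": 5, "sort": 5, "reduce": 90}
balanced = {"copy": 30, "sort": 30, "reduce": 40}

if __name__ == "__main__":
    print(f"one phase dominates: {relative_saving(dominated):.0%} saved")
    print(f"balanced phases:     {relative_saving(balanced):.0%} saved")
```

Under these assumptions, the job with one dominant phase saves far less rerun work from splitting than the balanced one, because most failures land in the long phase and that phase has to be redone either way.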
On 1/10/08