Paul Sutter wrote:
> First, it matters in the case of concurrent jobs. If you submit a 20-minute job while a 20-hour job is running, it would be nice if the reducers for the 20-minute job could get a chance to run before all of the 20-hour job's mappers have finished. So even without a throughput improvement, you gain an important capability (although it may require another minor tweak or two to make possible).
I fear that more than a minor tweak or two will be required to make concurrent jobs work well. For example, you would also want to make sure that the long-running job does not consume all of the reduce slots, or the short job would again get stuck behind it. Pausing long-running tasks might be required.
The best way to do this at present is to run two job trackers, and two task trackers per node, then submit long-running jobs to one "cluster" and short-running jobs to the other.
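As a rough sketch of what that second "cluster" might look like: the short-job task trackers would point at a second job tracker via a separate configuration file. This assumes the classic mapred-site.xml-style configuration; the host, port, and directory values below are purely illustrative, not a tested setup.

```xml
<!-- Illustrative second-cluster configuration for short jobs.
     Values (host, port, paths) are examples only. -->
<property>
  <name>mapred.job.tracker</name>
  <!-- A second job tracker on a distinct port from the main cluster's. -->
  <value>master:9002</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <!-- Separate local scratch space so the two task trackers on each
       node do not collide. -->
  <value>/tmp/hadoop-short/mapred/local</value>
</property>
```

Jobs would then be submitted against whichever configuration matches their expected running time.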
> Secondarily, we often have stragglers, where one mapper runs slower than the others. When this happens, we end up with a largely idle cluster for as long as an hour. In cases like these, good support for concurrent jobs _would_ improve throughput.
Can you perhaps increase the number of map tasks, so that even a slow task takes only a very small portion of the total execution time?
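For example, something along these lines in the job's configuration might help; note that the value is only a hint to the framework, and the figure here is illustrative rather than a recommendation:

```xml
<!-- Request many small map tasks so that a single straggler holds only
     a small fraction of the total work. Example value only; the
     framework treats this as a hint, not a guarantee. -->
<property>
  <name>mapred.map.tasks</name>
  <value>2000</value>
</property>
```

With finer-grained tasks, even a map task that runs several times slower than its peers delays the job by only a small slice of the total execution time.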
Good support for concurrent jobs would be great to have, and I'd love to see a patch that addresses this issue comprehensively. I am not convinced that it is worth making minor tweaks that may or may not really help us to get there.
Doug
