Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-29 Thread Renaud Delbru
Hi Allen, thanks for pointing this out. On 28/01/11 17:34, Allen Wittenauer wrote: As it seems that the capacity and fair schedulers in hadoop 0.20.2 do not allow a hard upper limit on the number of concurrent tasks, does anybody know any other solutions to achieve this? The specific change for

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-28 Thread Allen Wittenauer
On Jan 25, 2011, at 12:48 PM, Renaud Delbru wrote: As it seems that the capacity and fair schedulers in hadoop 0.20.2 do not allow a hard upper limit on the number of concurrent tasks, does anybody know any other solutions to achieve this? The specific change for capacity scheduler has been

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-27 Thread Renaud Delbru
Hi Koji, thanks for sharing the information. Is the 0.20-security branch planned to be an official release at some point? Cheers -- Renaud Delbru On 27/01/11 01:50, Koji Noguchi wrote: Hi Renaud, Hopefully it’ll be in 0.20-security branch that Arun is trying to push. Related (very abstract)

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-27 Thread Steve Loughran
On 27/01/11 10:51, Renaud Delbru wrote: Hi Koji, thanks for sharing the information. Is the 0.20-security branch planned to be an official release at some point? Cheers If you can play with the beta, you can see whether it works for you and, if not, get bugs fixed during the beta cycle

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-27 Thread Renaud Delbru
Thanks, we will try to test it next week. -- Renaud Delbru On 27/01/11 11:31, Steve Loughran wrote: On 27/01/11 10:51, Renaud Delbru wrote: Hi Koji, thanks for sharing the information. Is the 0.20-security branch planned to be an official release at some point? Cheers If you can play with

Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-25 Thread Renaud Delbru
Hi, we would like to limit the maximum number of tasks per job on our hadoop 0.20.2 cluster. Will the Capacity Scheduler [1] allow us to do this? Does it work correctly on hadoop 0.20.2? (I remember that a few months ago we were looking at it, but it seemed incompatible with hadoop 0.20.2.)

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-25 Thread Harsh J
Capacity Scheduler (or a version of it) does ship with the 0.20 release of Hadoop and is usable. It can be used to define queues, each with a limited capacity, to which your jobs must be submitted if you want them to use only the assigned fraction of your cluster for their processing.
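
As a rough illustration of the queue-capacity idea described above, the arithmetic can be sketched as follows. This is not Hadoop code: the cluster size, the queue names, the percentages, and the helper function are all hypothetical, and rounding down is an assumption made only for the example. It shows how a percentage capacity maps to a share of the cluster's slots; as later replies in this thread note, that share should not be read as a strict per-job upper limit.

```python
def queue_slot_share(total_slots, capacity_percent):
    """Return the slot count matching a queue's capacity percentage.

    Hypothetical helper for illustration only; rounding down is an
    assumption, not documented scheduler behaviour.
    """
    if not 0 <= capacity_percent <= 100:
        raise ValueError("capacity_percent must be between 0 and 100")
    return (total_slots * capacity_percent) // 100


# Hypothetical cluster: 40 map slots shared by two made-up queues.
total_map_slots = 40
queue_capacities = {"default": 70, "limited": 30}

shares = {
    name: queue_slot_share(total_map_slots, percent)
    for name, percent in queue_capacities.items()
}

for name, slots in shares.items():
    print(f"queue {name}: {slots} of {total_map_slots} map slots")
```

With these made-up numbers, a job submitted to the "limited" queue would be working against a share of 12 map slots rather than the whole cluster.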

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-25 Thread Renaud Delbru
Our experience with the Capacity Scheduler was not what we expected and what you describe. But, it might be due to a misunderstanding of the configuration parameters. The problem is the following: mapred.capacity-scheduler.queue.queue-name.capacity: Percentage of the number of slots in the

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-25 Thread Harsh J
No, that is right. I did not assume that it was a very strict slot limit you were looking to impose for your jobs. On Tue, Jan 25, 2011 at 9:27 PM, Renaud Delbru renaud.del...@deri.org wrote: Our experience with the Capacity Scheduler was not what we expected and what you describe. But, it

Re: Best way to limit the number of concurrent tasks per job on hadoop 0.20.2

2011-01-25 Thread Renaud Delbru
As it seems that the capacity and fair schedulers in hadoop 0.20.2 do not allow a hard upper limit on the number of concurrent tasks, does anybody know any other solutions to achieve this? -- Renaud Delbru On 25/01/11 11:49, Renaud Delbru wrote: Hi, we would like to limit the number of maximum