>> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or upcoming 
>> 0.20.205 with security + append) you would still see this behaviour because 
>> you are hitting 'user limits' where the CS will not allow a single user to 
>> take more than the queue 'configured' capacity (12 slots here). You will 
>> need more than one user in the 'orange' queue to go over the queue's 
>> capacity. This is to prevent a single user from hogging the system's 
>> resources.

>> If you really want one user to acquire more resources in 'orange' queue, you 
>> need to tweak mapred.capacity-scheduler.queue.orange.user-limit-factor.

Arun, you're the man!!!
That exactly solves my issue.
Submitting jobs as another user allowed the queue to burst past its capacity.
In my setup we currently have only one user for everything, and for that
case user-limit-factor definitely works!!

-------------
Map tasks
Capacity: 12 slots
Maximum capacity: 32 slots
Used capacity: 16 (133.3% of Capacity) <------
Running tasks: 16
Active users:
User 'apps': 16 (100.0% of used capacity)
-------------

This is the configuration for the orange queue.
<!-- Queue: orange -->
  <property>
    <name>mapred.capacity-scheduler.queue.orange.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.maximum-capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.supports-priority</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.orange.user-limit-factor</name>
    <value>2</value>
  </property>
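
For anyone reading this later, here is a rough sketch of how I understand the
numbers above. It is only a simplified model, not the actual CapacityScheduler
code. The function name is made up, the cluster size of 32 map slots is
inferred from "Maximum capacity: 32 slots" at maximum-capacity=100, and
rounding down is an assumption that happens to match "Capacity: 12 slots".

```python
def single_user_slot_cap(cluster_slots, capacity_pct, max_capacity_pct,
                         user_limit_factor=1):
    """Approximate the most slots one user can hold in a queue.

    Simplified model (assumption): a single user may take up to
    queue capacity * user-limit-factor, never more than the queue's
    maximum capacity.
    """
    # Queue capacity in slots, rounded down (assumed rounding).
    queue_slots = cluster_slots * capacity_pct // 100
    # Hard ceiling for the whole queue.
    max_slots = cluster_slots * max_capacity_pct // 100
    return min(int(queue_slots * user_limit_factor), max_slots)


if __name__ == "__main__":
    # Values from the orange queue: capacity=40, maximum-capacity=100.
    for factor in (1, 2):
        cap = single_user_slot_cap(32, 40, 100, factor)
        print("user-limit-factor=%s -> single-user cap: %d slots" % (factor, cap))
```

Under these assumptions a factor of 1 holds one user to the 12-slot capacity,
while a factor of 2 raises the cap to 24 slots, which fits with user 'apps'
running 16 tasks above.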

---------------------------------------------------

CDH3u0 does support the CS, but there is one interesting and sad part that
I want to mention here.

This is the link that I followed from the CDH web site:

http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0/capacity_scheduler.html

The page doesn't mention user-limit-factor at all.


>>  this way you gain better understanding of the system and we, the project, 
>> will hopefully gain another valuable contributor... hint, hint. ;-)
;-) Got the hint.
As a Unix sysadmin I'm pretty much at 0 on Java coding... lol, but not 0 in PHP/Perl.
What can I do to contribute, and how can I start?

Cheers,
-P



On Sun, Oct 16, 2011 at 8:46 AM, Arun C Murthy <[email protected]> wrote:
> You are welcome. *smile*
>
> One of the greatest advantages of open-src s/w is that you can look at the 
> code while scratching your head in the corner - this way you gain better 
> understanding of the system and we, the project, will hopefully gain another 
> valuable contributor... hint, hint. ;-)
>
> Good luck.
>
> Arun
>
> On Oct 16, 2011, at 1:27 AM, patrick sang wrote:
>
>> Hi Arun,
>>
>> Your answer sheds extra bright light while I am scratching head in the 
>> corner.
>> 1 million thanks for answer and document. I will post back the result.
>>
>> Thanks again,
>> P
>>
>> On Sat, Oct 15, 2011 at 10:32 PM, Arun C Murthy <[email protected]> wrote:
>>>
>>> Hi Patrick,
>>>
>>> It's hard to diagnose CDH since I don't know what patch-sets they have for 
>>> the CapacityScheduler - afaik they only support FairScheduler, but that 
>>> might have changed.
>>>
>>> On Oct 15, 2011, at 4:45 PM, patrick sang wrote:
>>>
>>>> 4. from webUI, scheduling  information of orange queue.
>>>>
>>>> It said "Used capacity: 12 (100.0% of Capacity)"
>>>> while next line said "Maximum capacity: 16 slots"
>>>> So what's going on with the other 4 slots? Why are they not being used?
>>>>
>>>> Is the capacity-scheduler supposed to start using extra slots until it
>>>> hits the Max capacity?
>>>> (from the variable of
>>>> mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity)
>>>> (there are no other jobs at all in the cluster)
>>>>
>>>> I am really thankful for reading up to this point.
>>>> Truly hope someone can shed some light on this.
>>>>
>>>
>>> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or upcoming 
>>> 0.20.205 with security + append) you would still see this behaviour because 
>>> you are hitting 'user limits' where the CS will not allow a single user to 
>>> take more than the queue 'configured' capacity (12 slots here). You will 
>>> need more than one user in the 'orange' queue  to go over the queue's 
>>> capacity. This is to prevent a single user from hogging the system's 
>>> resources.
>>>
>>> If you really want one user to acquire more resources in 'orange' queue, 
>>> you need to tweak mapred.capacity-scheduler.queue.orange.user-limit-factor.
>>>
>>> More details here:
>>> http://hadoop.apache.org/common/docs/stable/capacity_scheduler.html
>>>
>>> Arun
>>>
>
>
