Good thread. It cleared up some of my confusion about the scheduler as well ;) Thanks!
On Wed, Oct 19, 2011 at 5:03 PM, patrick sang <[email protected]> wrote:

> >> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or
> >> upcoming 0.20.205 with security + append) you would still see this
> >> behaviour because you are hitting 'user limits' where the CS will not
> >> allow a single user to take more than the queue 'configured' capacity
> >> (12 slots here). You will need more than one user in the 'orange' queue
> >> to go over the queue's capacity. This is to prevent a single user from
> >> hogging the system's resources.
>
> >> If you really want one user to acquire more resources in 'orange' queue,
> >> you need to tweak mapred.capacity-scheduler.queue.orange.user-limit-factor.
>
> Arun, you're the man!!!
> It is exactly solve my issue.
> submitting jobs by another user allowed the queue burst pass the capacity.
> In my settings, at this point we have only one user for all which
> definitely user-limit-factor does work!!
>
> -------------
> Map tasks
> Capacity: 12 slots
> Maximum capacity: 32 slots
> Used capacity: 16 (133.3% of Capacity) <------
> Running tasks: 16
> Active users:
> User 'apps': 16 (100.0% of used capacity)
> -------------
>
> This is the configuration for orange queue.
> <!-- Queue: orange -->
> <property>
>   <name>mapred.capacity-scheduler.queue.orange.capacity</name>
>   <value>40</value>
> </property>
> <property>
>   <name>mapred.capacity-scheduler.queue.orange.maximum-capacity</name>
>   <value>100</value>
> </property>
> <property>
>   <name>mapred.capacity-scheduler.queue.orange.supports-priority</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.capacity-scheduler.queue.orange.user-limit-factor</name>
>   <value>2</value>
> </property>
>
> ---------------------------------------------------
>
> in CDH3u0, it supports CS, but
>
> One interesting and sad part that i want to mention here.
>
> This is the link that I followed from cdh web site.
> http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0/capacity_scheduler.html
>
> it doesn't mention about user-limit-factor in the page at all.
>
> >> this way you gain better understanding of the system and we, the
> >> project, will hopefully gain another valuable contributor... hint, hint. ;-)
>
> ;-).. got the hint.
> As unix sysadmin, pretty much 0 on java coding...lol but not 0 in php/perl;
> what i can do to contribute... how can i start ..?
>
> Cheers,
> -P
>
> On Sun, Oct 16, 2011 at 8:46 AM, Arun C Murthy <[email protected]> wrote:
> > You are welcome. *smile*
> >
> > One of the greatest advantages of open-src s/w is that you can look at
> > the code while scratching your head in the corner - this way you gain
> > better understanding of the system and we, the project, will hopefully
> > gain another valuable contributor... hint, hint. ;-)
> >
> > Good luck.
> >
> > Arun
> >
> > On Oct 16, 2011, at 1:27 AM, patrick sang wrote:
> >
> >> Hi Arun,
> >>
> >> Your answer sheds extra bright light while I am scratching head in the
> >> corner.
> >> 1 million thanks for answer and document. I will post back the result.
> >>
> >> Thanks again,
> >> P
> >>
> >> On Sat, Oct 15, 2011 at 10:32 PM, Arun C Murthy <[email protected]> wrote:
> >>>
> >>> Hi Patrick,
> >>>
> >>> It's hard to diagnose CDH since I don't know what patch-sets they have
> >>> for the CapacityScheduler - afaik they only support FairScheduler, but
> >>> that might have changed.
> >>>
> >>> On Oct 15, 2011, at 4:45 PM, patrick sang wrote:
> >>>
> >>>> 4. from webUI, scheduling information of orange queue.
> >>>>
> >>>> It said "Used capacity: 12 (100.0% of Capacity)"
> >>>> while next line said "Maximum capacity: 16 slots"
> >>>> So what's going on with other 4 slots ? why they are not get used.
> >>>>
> >>>> Is capacity-scheduler supposed to start using extra slots until it hit
> >>>> the Max capacity ?
> >>>> (from the variable of
> >>>> mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity)
> >>>> (there are no other jobs at all in the cluster)
> >>>>
> >>>> I am really thankful for reading up to this point.
> >>>> Truly hope someone can shed some light on this.
> >>>>
> >>>
> >>> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or
> >>> upcoming 0.20.205 with security + append) you would still see this
> >>> behaviour because you are hitting 'user limits' where the CS will not
> >>> allow a single user to take more than the queue 'configured' capacity
> >>> (12 slots here). You will need more than one user in the 'orange' queue
> >>> to go over the queue's capacity. This is to prevent a single user from
> >>> hogging the system's resources.
> >>>
> >>> If you really want one user to acquire more resources in 'orange'
> >>> queue, you need to tweak
> >>> mapred.capacity-scheduler.queue.orange.user-limit-factor.
> >>>
> >>> More details here:
> >>> http://hadoop.apache.org/common/docs/stable/capacity_scheduler.html
> >>>
> >>> Arun
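
For anyone finding this thread later: the slot numbers quoted in it can be reproduced with a bit of arithmetic. What follows is only a rough sketch in Python, not the scheduler's actual code. The function names are made up for illustration, and the model is a simplification: it assumes one active user, a user-limit-factor of 1 unless set otherwise, and it ignores minimum-user-limit-percent and any sharing between several users.

```python
def single_user_slot_cap(capacity_slots, max_capacity_slots, user_limit_factor=1.0):
    """Hypothetical helper: rough upper bound on slots one user can hold in a queue.

    Simplified model (an assumption, not the real scheduler logic): a lone user
    may take up to capacity * user-limit-factor slots, and the queue's maximum
    capacity remains a hard ceiling.
    """
    return min(int(capacity_slots * user_limit_factor), max_capacity_slots)


def used_percent(used_slots, capacity_slots):
    """Percentage of configured capacity in use, as shown on the scheduler web UI."""
    return round(100.0 * used_slots / capacity_slots, 1)


# Numbers taken from the thread: orange queue, 12 slots capacity, 32 slots maximum.
print(single_user_slot_cap(12, 32))        # factor 1: stuck at the configured capacity
print(single_user_slot_cap(12, 32, 2.0))   # factor 2: one user may go past 12 slots
print(used_percent(16, 12))                # the "133.3% of Capacity" line
```

Under these assumptions a factor of 1 pins a single user to the 12 configured slots, which is the behaviour Arun describes, while a factor of 2 leaves room for the 16 running tasks reported for user 'apps'.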
