If all your apps are MR, then what you are looking for is MAPREDUCE-5583 (the limit can be set per job).
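For reference, MAPREDUCE-5583 (shipped in Hadoop 2.7.0) added per-job caps on simultaneously running tasks. A hedged sketch of the relevant properties as they would appear in a job config (values here are illustrative; 0, the default, means no limit):

```xml
<!-- Caps the number of map tasks this job may run at the same time. -->
<property>
  <name>mapreduce.job.running.map.limit</name>
  <value>100</value>
</property>
<!-- Same cap for concurrently running reduce tasks. -->
<property>
  <name>mapreduce.job.running.reduce.limit</name>
  <value>20</value>
</property>
```

These are job-level settings, so they can also be passed on the command line with `-D` when submitting, which matches the "per-job" point above.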
On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <[email protected]> wrote:

> Hi Naga,
>
> Like most app-level configurations, the admin can configure defaults
> which users may want to override at the application level.
>
> If this is at queue level, then all applications in a queue will have the
> same limits. But all our applications in a queue may not have the same
> SLA, and we may need to restrict them differently. That again requires
> splitting queues further, which I feel is more overhead.
>
> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
> [email protected]> wrote:
>
>> Hi Laxman,
>>
>> Ideally, I understand it would be better if this were available at the
>> application level, but then each user is expected to ensure that he gives
>> the right configuration, within the limits of max capacity.
>> And what if a user submits some app (kind of a query execution app)
>> without this setting, or doesn't know how much it should take? In
>> general, specifying resources for containers is itself a difficult task
>> for users.
>> And it might not be right to expect the admin to do it for each
>> application in the queue either. Basically, governing will be difficult
>> if it's not enforced from the queue/scheduler side.
>>
>> + Naga
>>
>> ------------------------------
>> *From:* Laxman Ch [[email protected]]
>> *Sent:* Tuesday, September 29, 2015 16:52
>> *To:* [email protected]
>> *Subject:* Re: Concurrency control
>>
>> IMO, it's better to have an application-level configuration than a
>> scheduler/queue-level configuration.
>> A queue-level configuration will restrict every single application
>> that runs in that queue.
>> But we may want to configure these limits for only some set of jobs, and
>> the limits can differ per application.
>>
>> On the FairOrdering policy: the order of jobs can't be enforced, as
>> these are ad hoc jobs, scheduled/owned independently by different teams.
>>
>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>> [email protected]> wrote:
>>
>>> Hi Laxman,
>>>
>>> What I meant was: suppose we support and configure
>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to 0.25; then a
>>> single app should not take more than 25% of the resources in the queue.
>>> This would be a more generic configuration which can be enforced by the
>>> admin, rather than expecting the user to configure it per app.
>>>
>>> As for Rohith's suggestion of the FairOrdering policy, I think it
>>> should solve the problem, provided the app which was submitted first has
>>> not already hogged all the queue's resources.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [[email protected]]
>>> *Sent:* Tuesday, September 29, 2015 16:03
>>> *To:* [email protected]
>>> *Subject:* Re: Concurrency control
>>>
>>> Thanks Rohith, Naga and Lloyd for the responses.
>>>
>>> > I think Laxman should also tell us more about which application type
>>> > he is running.
>>>
>>> We run MR jobs, mostly with the default core/memory allocation (1 vcore,
>>> 1.5 GB).
>>> Our problem is more about controlling the resources used simultaneously
>>> by all running containers, at any given point of time, per application.
>>>
>>> Example:
>>> 1. App1 and App2 are two MR apps.
>>> 2. App1 and App2 belong to the same queue (capacity: 100 vcores, 150 GB).
>>> 3. Each App1 task takes 8 hrs to complete.
>>> 4. Each App2 task takes 5 mins to complete.
>>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits longer for
>>> App1 tasks to release the resources.
>>> 7. We can't have preemption enabled, as we don't want to lose the work
>>> completed so far by App1.
>>> 8. We can't have separate queues for App1 and App2, as we have lots of
>>> jobs like this and it would explode the number of queues.
>>> 9. We use CapacityScheduler.
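The numbered scenario above reduces to simple arithmetic, sketched here as a toy calculation (this is not YARN code; the function and its name are illustrative only):

```python
# Toy model of the App1/App2 scenario: a queue of 100 vcores, where each
# App1 task holds a vcore for 8 hours (480 minutes).

def app2_wait_minutes(queue_vcores, app1_cap, app1_task_minutes):
    """Worst-case wait before App2 gets its first vcore.

    If App1 is capped below the queue size, free vcores exist and App2
    starts immediately; otherwise App2 must wait for a whole App1 task
    to finish (no preemption, per point 7 above).
    """
    free = queue_vcores - min(app1_cap, queue_vcores)
    return 0 if free > 0 else app1_task_minutes

# Uncapped: App1 fills all 100 vcores, so App2 can wait a full 8 hours.
assert app2_wait_minutes(100, 100, 8 * 60) == 480

# Capped at 50 vcores: half the queue stays free and App2 starts at once.
assert app2_wait_minutes(100, 50, 8 * 60) == 0
```

This is exactly the trade-off Laxman describes next: the cap makes App1 slower but removes the starvation window for App2.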
>>>
>>> In this scenario, if I can cap App1's concurrent usage at 50 vcores and
>>> 75 GB, then App1 may take longer to finish, but there won't be any
>>> starvation for App2 (and other jobs running in the same queue).
>>>
>>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>>
>>> @Naga, I couldn't think through the expected behavior of
>>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>>> I will revert on this.
>>>
>>> On 29 September 2015 at 14:57, Namikaze Minato <[email protected]>
>>> wrote:
>>>
>>>> I think Laxman should also tell us more about which application type
>>>> he is running. The normal use case of MAPREDUCE should be working as
>>>> intended, but if he has, for example, one MAP using 100 vcores, then
>>>> the second map will have to wait until the app completes. The same
>>>> would happen if the applications running were Spark, as Spark does not
>>>> free what is allocated to it.
>>>>
>>>> Regards,
>>>> Lloyd
>>>>
>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>> <[email protected]> wrote:
>>>> > Thanks Rohith for your thoughts.
>>>> > But I think this configuration might not completely solve the
>>>> > scenario mentioned by Laxman: if there is some time gap between the
>>>> > first and the second app, then even though we have fairness or
>>>> > priority set for apps, starvation will be there.
>>>> > IIUC, we can think of an approach where we have something similar to
>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor", which
>>>> > could provide functionality like
>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the
>>>> > multiple of the queue capacity which can be configured to allow a
>>>> > single app to acquire more resources. Thoughts?
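To make Naga's proposal concrete: `user-limit-factor` is a real, shipped CapacityScheduler knob, while `app-limit-factor` exists only as the idea floated in this thread and was never implemented in released Hadoop. A hedged sketch of how the two would sit side by side in `capacity-scheduler.xml` (the queue name `root.adhoc` is illustrative):

```xml
<!-- Real knob: the multiple of queue capacity a SINGLE USER may claim
     when spare resources are available (default 1). -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.user-limit-factor</name>
  <value>1</value>
</property>

<!-- HYPOTHETICAL knob proposed above, NOT present in any Hadoop release:
     the per-APP analogue. 0.25 would cap one app at 25% of the queue. -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.app-limit-factor</name>
  <value>0.25</value>
</property>
```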
>>>> >
>>>> > + Naga
>>>> >
>>>> > ________________________________
>>>> > From: Rohith Sharma K S [[email protected]]
>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>> > To: [email protected]
>>>> > Subject: RE: Concurrency control
>>>> >
>>>> > Hi Laxman,
>>>> >
>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides a
>>>> > configuration for the ordering policy. By configuring
>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>>> > your goal, i.e. avoid starving applications of resources.
>>>> >
>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>> >
>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>> > FairScheduler FairSharePolicy); generally, entities with lesser
>>>> > usage are ordered first. If sizeBasedWeight is set to true, then an
>>>> > application with high demand may be prioritized ahead of an
>>>> > application with less usage. This is to offset the tendency to favor
>>>> > small apps, which could result in starvation for large apps if many
>>>> > small ones enter and leave the queue continuously (optional, default
>>>> > false).
>>>> >
>>>> > Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>>>> >
>>>> > Thanks & Regards
>>>> > Rohith Sharma K S
>>>> >
>>>> > From: Laxman Ch [mailto:[email protected]]
>>>> > Sent: 29 September 2015 13:36
>>>> > To: [email protected]
>>>> > Subject: Re: Concurrency control
>>>> >
>>>> > Bouncing this thread again. Any other thoughts, please?
>>>> >
>>>> > On 17 September 2015 at 23:21, Laxman Ch <[email protected]>
>>>> > wrote:
>>>> >
>>>> > No Naga. That won't help.
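Rohith's 2.8 suggestion above would, to the best of my understanding, be configured per queue in `capacity-scheduler.xml` roughly as follows (queue name `root.adhoc` is illustrative; property names are as I believe YARN-3463 shipped them, so verify against your release's docs):

```xml
<!-- Switch the queue from the default FIFO ordering to fair ordering. -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
  <value>fair</value>
</property>

<!-- Optional: weight apps by outstanding demand (the sizeBasedWeight
     behavior described in the javadoc quoted above). -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
  <value>true</value>
</property>
```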
>>>> >
>>>> > I am running two applications (app1: 100 vcores, app2: 100 vcores)
>>>> > with the same user, in the same queue (capacity = 100 vcores). In
>>>> > this scenario, if app1 triggers first, occupies all the slots, and
>>>> > runs long, then app2 will starve longer.
>>>> >
>>>> > Let me reiterate my problem statement. I want "to control the amount
>>>> > of resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>>> >
>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>> > <[email protected]> wrote:
>>>> >
>>>> > Hi Laxman,
>>>> >
>>>> > For the example you have stated, maybe we can do the following:
>>>> >
>>>> > 1. Create/modify the queue with capacity and max capacity set such
>>>> > that it's equivalent to 100 vcores. As there is no elasticity, a
>>>> > given application will not use resources beyond the configured
>>>> > capacity.
>>>> >
>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>> > so that each active user is assured the minimum guaranteed resources.
>>>> > The default value of 100 implies no user limits are imposed.
>>>> >
>>>> > Additionally, we can think of
>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>> > which will enforce strict CPU usage for a given container if
>>>> > required.
>>>> >
>>>> > + Naga
>>>> >
>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <[email protected]>
>>>> > wrote:
>>>> >
>>>> > Yes, I'm already using cgroups. Cgroups help in controlling resources
>>>> > at the container level. But my requirement is more about controlling
>>>> > the concurrent resource usage of an application at the whole
>>>> > cluster level.
>>>> >
>>>> > And yes, we do configure queues properly. But that won't help.
>>>> >
>>>> > For example, I have an application with a requirement of 1000 vcores.
>>>> But, I >>>> > wanted to control this application not to go beyond 100 vcores at any >>>> point >>>> > of time in the cluster/queue. This makes that application to run >>>> longer even >>>> > when my cluster is free but I will be able meet the guaranteed SLAs >>>> of other >>>> > applications. >>>> > >>>> > >>>> > >>>> > Hope this helps to understand my question. >>>> > >>>> > >>>> > >>>> > And thanks Narasimha for quick response. >>>> > >>>> > >>>> > >>>> > On 17 September 2015 at 16:17, Naganarasimha Garla >>>> > <[email protected]> wrote: >>>> > >>>> > Hi Laxman, >>>> > >>>> > Yes if cgroups are enabled and >>>> "yarn.scheduler.capacity.resource-calculator" >>>> > configured to DominantResourceCalculator then cpu and memory can be >>>> > controlled. >>>> > >>>> > Please Kindly furhter refer to the official documentation >>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html >>>> > >>>> > >>>> > >>>> > But may be if say more about problem then we can suggest ideal >>>> > configuration, seems like capacity configuration and splitting of the >>>> queue >>>> > is not rightly done or you might refer to Fair Scheduler if you want >>>> more >>>> > fairness for container allocation for different apps. >>>> > >>>> > >>>> > >>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <[email protected]> >>>> wrote: >>>> > >>>> > Hi, >>>> > >>>> > >>>> > >>>> > In YARN, do we have any way to control the amount of resources >>>> (vcores, >>>> > memory) used by an application SIMULTANEOUSLY. >>>> > >>>> > >>>> > >>>> > - In my cluster, noticed some large and long running mr-app occupied >>>> all the >>>> > slots of the queue and blocking other apps to get started. >>>> > >>>> > - I'm using Capacity schedulers (using hierarchical queues and >>>> preemption >>>> > disabled) >>>> > >>>> > - Using Hadoop version 2.6.0 >>>> > >>>> > - Did some googling around this and gone through configuration docs >>>> but I'm >>>> > not able to find anything that matches my requirement. 
>>>> >
>>>> > If needed, I can provide more details on the use case and the
>>>> > problem.
>>>> >
>>>> > --
>>>> > Thanks,
>>>> > Laxman
>>>
>>> --
>>> Thanks,
>>> Laxman
>>
>> --
>> Thanks,
>> Laxman
>
> --
> Thanks,
> Laxman
