Thanks Harsh J for sharing that useful info, but can we think of some way to support this scenario from the YARN side? For example, the queue configuration I mentioned, or an app-specific override along the lines Laxman suggested?
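For anyone following the thread, a minimal sketch of the per-job knobs Harsh referenced below (added by MAPREDUCE-5583; the values here are illustrative and the property names should be verified against your Hadoop release):

    <!-- Cap how many map/reduce tasks of this job may run at the same time;
         0 (the default) means no limit. -->
    <property>
      <name>mapreduce.job.running.map.limit</name>
      <value>50</value>
    </property>
    <property>
      <name>mapreduce.job.running.reduce.limit</name>
      <value>10</value>
    </property>

The same limits can also be passed per job on the command line, e.g. -Dmapreduce.job.running.map.limit=50.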
On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <[email protected]> wrote:

> Thanks and perfect, Harsh. Exactly what I am looking for. Most of our applications are MR, so this should be sufficient for us. I will give these configurations a try and post my findings again here. Thanks again.
>
> Thanks Naga, Rohith & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <[email protected]> wrote:
>
>> If all your apps are MR, then what you are looking for is MAPREDUCE-5583 (it can be set per-job).
>>
>> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <[email protected]> wrote:
>>
>>> Hi Naga,
>>>
>>> Like most of the app-level configurations, the admin can configure the defaults, which the user may want to override at the application level.
>>>
>>> If this is at queue level, then all applications in a queue will have the same limits. But all our applications in a queue may not have the same SLA, and we may need to restrict them differently. That again requires splitting queues further, which I feel is more overhead.
>>>
>>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> Ideally I understand it would be better if this were available at the application level, but then each user is expected to ensure that he gives the right configuration, within the limits of max capacity. And what if a user submits some app *(kind of a query execution app)* without this setting, *or* he doesn't know how much it should take? In general, users specifying resources for containers is itself a difficult task. And it might not be right to expect that the admin will do it for each application in the queue either. Basically, governing will be difficult if it is not enforced from the queue/scheduler side.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [[email protected]]
>>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>> *To:* [email protected]
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> IMO, it's better to have an application-level configuration than a scheduler/queue-level configuration. Having a queue-level configuration will restrict every single application that runs in that queue. But we may want to configure these limits for only some set of jobs, and the limits can be different for every application.
>>>>
>>>> On the FairOrdering policy: the order of jobs can't be enforced, as these are ad hoc jobs scheduled/owned independently by different teams.
>>>>
>>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>>
>>>>> Hi Laxman,
>>>>>
>>>>> What I meant was: suppose we support and configure yarn.scheduler.capacity.<queue-path>.app-limit-factor to 0.25; then a single app should not take more than 25% of the resources in the queue. This would be a more generic configuration which can be enforced by the admin, rather than expecting it to be configured per app by the user.
>>>>>
>>>>> And for Rohith's suggestion of the FairOrdering policy, I think it should solve the problem if the app which is submitted first has not already hogged all the queue's resources.
>>>>>
>>>>> + Naga
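For reference, a minimal sketch of how the FAIR ordering policy discussed here would be enabled per queue in capacity-scheduler.xml on Hadoop 2.8 (the queue name "default" is only a placeholder, and the exact property names should be double-checked against the 2.8 CapacityScheduler documentation):

    <!-- Order apps within the queue for fairness instead of FIFO -->
    <property>
      <name>yarn.scheduler.capacity.root.default.ordering-policy</name>
      <value>fair</value>
    </property>
    <!-- Optionally weigh pending demand so large apps are not starved by a stream of small ones -->
    <property>
      <name>yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight</name>
      <value>true</value>
    </property>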
>>>>>
>>>>> ------------------------------
>>>>> *From:* Laxman Ch [[email protected]]
>>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Concurrency control
>>>>>
>>>>> Thanks Rohith, Naga and Lloyd for the responses.
>>>>>
>>>>> > I think Laxman should also tell us more about which application type he is running.
>>>>>
>>>>> We run MR jobs mostly with the default core/memory allocation (1 vcore, 1.5 GB). Our problem is more about controlling the *resources used simultaneously by all running containers* at any given point of time, per application.
>>>>>
>>>>> Example:
>>>>> 1. App1 and App2 are two MR apps.
>>>>> 2. App1 and App2 belong to the same queue (capacity: 100 vcores, 150 GB).
>>>>> 3. Each App1 task takes 8 hrs to complete.
>>>>> 4. Each App2 task takes 5 mins to complete.
>>>>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>>>>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits a long time for App1 tasks to release resources.
>>>>> 7. We can't have preemption enabled, as we don't want to lose the work completed so far by App1.
>>>>> 8. We can't have separate queues for App1 and App2, as we have lots of jobs like this and it would explode the number of queues.
>>>>> 9. We use CapacityScheduler.
>>>>>
>>>>> In this scenario, if I can limit App1's concurrent usage to 50 vcores and 75 GB, then App1 may take longer to finish, but there won't be any starvation for App2 (and other jobs running in the same queue).
>>>>>
>>>>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>>>>
>>>>> @Naga, I couldn't think through the expected behavior of "yarn.scheduler.capacity.<queue-path>.app-limit-factor". I will revert on this.
>>>>>
>>>>> On 29 September 2015 at 14:57, Namikaze Minato <[email protected]> wrote:
>>>>>
>>>>>> I think Laxman should also tell us more about which application type he is running. The normal use case of MAPREDUCE should be working as intended, but if he has, for example, one map using 100 vcores, then the second map will have to wait until the app completes. The same would happen if the applications running were Spark, as Spark does not free what is allocated to it.
>>>>>>
>>>>>> Regards,
>>>>>> LLoyd
>>>>>>
>>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>>>> > Thanks Rohith for your thoughts,
>>>>>> > But I think this configuration might not completely solve the scenario mentioned by Laxman: if there is some time gap between the first and the second app, then even though we have fairness or priority set for apps, starvation will still be there.
>>>>>> > IIUC, we can think of an approach similar to "yarn.scheduler.capacity.<queue-path>.user-limit-factor", providing functionality like "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple of the queue capacity that a single app can be allowed to acquire. Thoughts?
>>>>>> >
>>>>>> > + Naga
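To make that proposal concrete, a purely hypothetical sketch of what it could look like in capacity-scheduler.xml (this property does not exist in any released CapacityScheduler; the queue name "default" is a placeholder):

    <!-- Hypothetical property, analogous to user-limit-factor:
         0.25 would mean a single application may hold at most 25% of the queue's resources. -->
    <property>
      <name>yarn.scheduler.capacity.root.default.app-limit-factor</name>
      <value>0.25</value>
    </property>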
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: Rohith Sharma K S [[email protected]]
>>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>>> > To: [email protected]
>>>>>> > Subject: RE: Concurrency control
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides a configuration for the ordering policy. By configuring FAIR_ORDERING_POLICY in CS, you should probably be able to achieve your goal, i.e. avoid starving applications of resources.
>>>>>> >
>>>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>>> >
>>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see FairScheduler FairSharePolicy); generally, processes with lesser usage are ordered first. If sizeBasedWeight is set to true, then an application with high demand may be prioritized ahead of an application with less usage. This is to offset the tendency to favor small apps, which could result in starvation for large apps if many small ones enter and leave the queue continuously (optional, default false).
>>>>>> >
>>>>>> > Community issue id: https://issues.apache.org/jira/browse/YARN-3463
>>>>>> >
>>>>>> > Thanks & Regards
>>>>>> > Rohith Sharma K S
>>>>>> >
>>>>>> > From: Laxman Ch [mailto:[email protected]]
>>>>>> > Sent: 29 September 2015 13:36
>>>>>> > To: [email protected]
>>>>>> > Subject: Re: Concurrency control
>>>>>> >
>>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>>> >
>>>>>> > On 17 September 2015 at 23:21, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > No Naga, that won't help.
>>>>>> >
>>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores) as the same user in the same queue (capacity = 100 vcores). In this scenario, if app1 starts first, occupies all the slots, and runs long, then app2 will starve for a long time.
>>>>>> >
>>>>>> > Let me reiterate my problem statement. I want "to control the amount of resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>>>>> >
>>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> > For the example you have stated, maybe we can do the following things (see the sketch below):
>>>>>> > 1. Create/modify the queue with capacity and max capacity set such that they are equivalent to 100 vcores. As there is then no elasticity, the given application will not use resources beyond the configured capacity.
>>>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so that each active user is assured the minimum guaranteed resources. The default value of 100 implies no user limits are imposed.
>>>>>> >
>>>>>> > Additionally, we can think of "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage", which will enforce strict CPU usage for a given container if required.
>>>>>> >
>>>>>> > + Naga
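A rough sketch of the settings Naga lists above, assuming a hypothetical queue root.default and purely illustrative percentages (capacity is expressed as a percentage of the parent queue, so pick values that work out to the intended 100 vcores):

    <!-- capacity-scheduler.xml: no elasticity, since maximum-capacity equals capacity -->
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>50</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
      <value>50</value>
    </property>
    <!-- Guarantee each active user in the queue at least 50% of the queue's resources -->
    <property>
      <name>yarn.scheduler.capacity.root.default.minimum-user-limit-percent</name>
      <value>50</value>
    </property>

    <!-- yarn-site.xml: make cgroups enforce the CPU a container was actually allocated -->
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
      <value>true</value>
    </property>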
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > Yes, I'm already using cgroups. Cgroups help in controlling resources at the container level, but my requirement is more about controlling the concurrent resource usage of an application at the whole-cluster level.
>>>>>> >
>>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>>> >
>>>>>> > For example, I have an application with a requirement of 1000 vcores, but I want to restrict this application from going beyond 100 vcores at any point of time in the cluster/queue. This makes the application run longer even when my cluster is free, but I will be able to meet the guaranteed SLAs of other applications.
>>>>>> >
>>>>>> > Hope this helps to understand my question.
>>>>>> >
>>>>>> > And thanks, Narasimha, for the quick response.
>>>>>> >
>>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> > Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator" is configured to DominantResourceCalculator, then CPU and memory can be controlled.
>>>>>> > Please further refer to the official documentation:
>>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>>> >
>>>>>> > But maybe if you say more about the problem, we can suggest an ideal configuration; it seems like the capacity configuration and splitting of the queues is not done right, or you might look at the Fair Scheduler if you want more fairness in container allocation across apps.
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> > In YARN, do we have any way to control the amount of resources (vcores, memory) used by an application SIMULTANEOUSLY?
>>>>>> >
>>>>>> > - In my cluster, I noticed some large, long-running MR apps occupy all the slots of the queue and block other apps from getting started.
>>>>>> > - I'm using the Capacity Scheduler (with hierarchical queues and preemption disabled).
>>>>>> > - Using Hadoop version 2.6.0.
>>>>>> > - I did some googling around this and went through the configuration docs, but I'm not able to find anything that matches my requirement.
>>>>>> >
>>>>>> > If needed, I can provide more details on the use case and problem.
>>>>>> >
>>>>>> > --
>>>>>> > Thanks,
>>>>>> > Laxman
>
> --
> Thanks,
> Laxman
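For completeness, a rough sketch of the cgroups plus DominantResourceCalculator setup mentioned earlier in the thread (property and class names as in Hadoop 2.6; the full LinuxContainerExecutor/cgroups prerequisites are in the official docs and should be checked for your version):

    <!-- capacity-scheduler.xml: schedule on both memory and vcores -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

    <!-- yarn-site.xml: LinuxContainerExecutor with the cgroups resources handler for CPU isolation -->
    <property>
      <name>yarn.nodemanager.container-executor.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
    </property>
    <property>
      <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
    </property>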
