If all your apps are MR, then what you are looking for is MAPREDUCE-5583 (the limit can be set per job).
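For reference, MAPREDUCE-5583 (shipped in Hadoop 2.7.0) added per-job caps on simultaneously running tasks. A hedged sketch of the relevant properties as they would appear in a job config (values here are illustrative; 0, the default, means no limit):

```xml
<!-- Caps the number of map tasks this job may run at the same time. -->
<property>
  <name>mapreduce.job.running.map.limit</name>
  <value>100</value>
</property>
<!-- Same cap for concurrently running reduce tasks. -->
<property>
  <name>mapreduce.job.running.reduce.limit</name>
  <value>20</value>
</property>
```

These are job-level settings, so they can also be passed on the command line with `-D` when submitting, which matches the "per-job" point above.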
On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <[email protected]> wrote:

> Hi Naga,
>
> Like most app-level configurations, the admin can configure defaults
> which users may want to override at the application level.
>
> If this is at queue level, then all applications in a queue will have the
> same limits. But all our applications in a queue may not have the same
> SLA, and we may need to restrict them differently. That again requires
> splitting queues further, which I feel is more overhead.
>
> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
> [email protected]> wrote:
>
>> Hi Laxman,
>>
>> Ideally, I understand it would be better if this were available at the
>> application level, but then each user is expected to ensure that he gives
>> the right configuration, within the limits of max capacity.
>> And what if a user submits some app (kind of a query execution app)
>> without this setting, or doesn't know how much it should take? In
>> general, specifying resources for containers is itself a difficult task
>> for users.
>> And it might not be right to expect the admin to do it for each
>> application in the queue either. Basically, governing will be difficult
>> if it's not enforced from the queue/scheduler side.
>>
>> + Naga
>>
>> ------------------------------
>> *From:* Laxman Ch [[email protected]]
>> *Sent:* Tuesday, September 29, 2015 16:52
>> *To:* [email protected]
>> *Subject:* Re: Concurrency control
>>
>> IMO, it's better to have an application-level configuration than a
>> scheduler/queue-level configuration.
>> A queue-level configuration will restrict every single application
>> that runs in that queue.
>> But we may want to configure these limits for only some set of jobs, and
>> the limits can differ per application.
>>
>> On the FairOrdering policy: the order of jobs can't be enforced, as
>> these are ad hoc jobs, scheduled/owned independently by different teams.
>>
>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>> [email protected]> wrote:
>>
>>> Hi Laxman,
>>>
>>> What I meant was: suppose we support and configure
>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to 0.25; then a
>>> single app should not take more than 25% of the resources in the queue.
>>> This would be a more generic configuration which can be enforced by the
>>> admin, rather than expecting the user to configure it per app.
>>>
>>> As for Rohith's suggestion of the FairOrdering policy, I think it
>>> should solve the problem, provided the app which was submitted first has
>>> not already hogged all the queue's resources.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [[email protected]]
>>> *Sent:* Tuesday, September 29, 2015 16:03
>>> *To:* [email protected]
>>> *Subject:* Re: Concurrency control
>>>
>>> Thanks Rohith, Naga and Lloyd for the responses.
>>>
>>> > I think Laxman should also tell us more about which application type
>>> > he is running.
>>>
>>> We run MR jobs, mostly with the default core/memory allocation (1 vcore,
>>> 1.5 GB).
>>> Our problem is more about controlling the resources used simultaneously
>>> by all running containers, at any given point of time, per application.
>>>
>>> Example:
>>> 1. App1 and App2 are two MR apps.
>>> 2. App1 and App2 belong to the same queue (capacity: 100 vcores, 150 GB).
>>> 3. Each App1 task takes 8 hrs to complete.
>>> 4. Each App2 task takes 5 mins to complete.
>>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits longer for
>>> App1 tasks to release the resources.
>>> 7. We can't have preemption enabled, as we don't want to lose the work
>>> completed so far by App1.
>>> 8. We can't have separate queues for App1 and App2, as we have lots of
>>> jobs like this and it would explode the number of queues.
>>> 9. We use CapacityScheduler.
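The numbered scenario above reduces to simple arithmetic, sketched here as a toy calculation (this is not YARN code; the function and its name are illustrative only):

```python
# Toy model of the App1/App2 scenario: a queue of 100 vcores, where each
# App1 task holds a vcore for 8 hours (480 minutes).

def app2_wait_minutes(queue_vcores, app1_cap, app1_task_minutes):
    """Worst-case wait before App2 gets its first vcore.

    If App1 is capped below the queue size, free vcores exist and App2
    starts immediately; otherwise App2 must wait for a whole App1 task
    to finish (no preemption, per point 7 above).
    """
    free = queue_vcores - min(app1_cap, queue_vcores)
    return 0 if free > 0 else app1_task_minutes

# Uncapped: App1 fills all 100 vcores, so App2 can wait a full 8 hours.
assert app2_wait_minutes(100, 100, 8 * 60) == 480

# Capped at 50 vcores: half the queue stays free and App2 starts at once.
assert app2_wait_minutes(100, 50, 8 * 60) == 0
```

This is exactly the trade-off Laxman describes next: the cap makes App1 slower but removes the starvation window for App2.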
>>>
>>> In this scenario, if I can cap App1's concurrent usage at 50 vcores and
>>> 75 GB, then App1 may take longer to finish, but there won't be any
>>> starvation for App2 (and other jobs running in the same queue).
>>>
>>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>>
>>> @Naga, I couldn't think through the expected behavior of
>>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>>> I will revert on this.
>>>
>>> On 29 September 2015 at 14:57, Namikaze Minato <[email protected]>
>>> wrote:
>>>
>>>> I think Laxman should also tell us more about which application type
>>>> he is running. The normal use case of MAPREDUCE should be working as
>>>> intended, but if he has, for example, one MAP using 100 vcores, then
>>>> the second map will have to wait until the app completes. The same
>>>> would happen if the applications running were Spark, as Spark does not
>>>> free what is allocated to it.
>>>>
>>>> Regards,
>>>> Lloyd
>>>>
>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>> <[email protected]> wrote:
>>>> > Thanks Rohith for your thoughts.
>>>> > But I think this configuration might not completely solve the
>>>> > scenario mentioned by Laxman: if there is some time gap between the
>>>> > first and the second app, then even though we have fairness or
>>>> > priority set for apps, starvation will be there.
>>>> > IIUC, we can think of an approach where we have something similar to
>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor", which
>>>> > could provide functionality like
>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the
>>>> > multiple of the queue capacity which can be configured to allow a
>>>> > single app to acquire more resources. Thoughts?
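To make Naga's proposal concrete: `user-limit-factor` is a real, shipped CapacityScheduler knob, while `app-limit-factor` exists only as the idea floated in this thread and was never implemented in released Hadoop. A hedged sketch of how the two would sit side by side in `capacity-scheduler.xml` (the queue name `root.adhoc` is illustrative):

```xml
<!-- Real knob: the multiple of queue capacity a SINGLE USER may claim
     when spare resources are available (default 1). -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.user-limit-factor</name>
  <value>1</value>
</property>

<!-- HYPOTHETICAL knob proposed above, NOT present in any Hadoop release:
     the per-APP analogue. 0.25 would cap one app at 25% of the queue. -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.app-limit-factor</name>
  <value>0.25</value>
</property>
```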
>>>> >
>>>> > + Naga
>>>> >
>>>> > ________________________________
>>>> > From: Rohith Sharma K S [[email protected]]
>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>> > To: [email protected]
>>>> > Subject: RE: Concurrency control
>>>> >
>>>> > Hi Laxman,
>>>> >
>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides a
>>>> > configuration for the ordering policy. By configuring
>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>>> > your goal, i.e. avoid starving applications of resources.
>>>> >
>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>> >
>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>> > FairScheduler FairSharePolicy); generally, entities with lesser
>>>> > usage are ordered first. If sizeBasedWeight is set to true, then an
>>>> > application with high demand may be prioritized ahead of an
>>>> > application with less usage. This is to offset the tendency to favor
>>>> > small apps, which could result in starvation for large apps if many
>>>> > small ones enter and leave the queue continuously (optional, default
>>>> > false).
>>>> >
>>>> > Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>>>> >
>>>> > Thanks & Regards
>>>> > Rohith Sharma K S
>>>> >
>>>> > From: Laxman Ch [mailto:[email protected]]
>>>> > Sent: 29 September 2015 13:36
>>>> > To: [email protected]
>>>> > Subject: Re: Concurrency control
>>>> >
>>>> > Bouncing this thread again. Any other thoughts, please?
>>>> >
>>>> > On 17 September 2015 at 23:21, Laxman Ch <[email protected]>
>>>> > wrote:
>>>> >
>>>> > No Naga. That won't help.
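Rohith's 2.8 suggestion above would, to the best of my understanding, be configured per queue in `capacity-scheduler.xml` roughly as follows (queue name `root.adhoc` is illustrative; property names are as I believe YARN-3463 shipped them, so verify against your release's docs):

```xml
<!-- Switch the queue from the default FIFO ordering to fair ordering. -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
  <value>fair</value>
</property>

<!-- Optional: weight apps by outstanding demand (the sizeBasedWeight
     behavior described in the javadoc quoted above). -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
  <value>true</value>
</property>
```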
>>>> >
>>>> > I am running two applications (app1: 100 vcores, app2: 100 vcores)
>>>> > with the same user, in the same queue (capacity = 100 vcores). In
>>>> > this scenario, if app1 triggers first, occupies all the slots, and
>>>> > runs long, then app2 will starve longer.
>>>> >
>>>> > Let me reiterate my problem statement. I want "to control the amount
>>>> > of resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>>> >
>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>> > <[email protected]> wrote:
>>>> >
>>>> > Hi Laxman,
>>>> >
>>>> > For the example you have stated, maybe we can do the following:
>>>> >
>>>> > 1. Create/modify the queue with capacity and max capacity set such
>>>> > that it's equivalent to 100 vcores. As there is no elasticity, a
>>>> > given application will not use resources beyond the configured
>>>> > capacity.
>>>> >
>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>> > so that each active user is assured the minimum guaranteed resources.
>>>> > The default value of 100 implies no user limits are imposed.
>>>> >
>>>> > Additionally, we can think of
>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>> > which will enforce strict CPU usage for a given container if
>>>> > required.
>>>> >
>>>> > + Naga
>>>> >
>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <[email protected]>
>>>> > wrote:
>>>> >
>>>> > Yes, I'm already using cgroups. Cgroups help in controlling resources
>>>> > at the container level. But my requirement is more about controlling
>>>> > the concurrent resource usage of an application at the whole
>>>> > cluster level.
>>>> >
>>>> > And yes, we do configure queues properly. But that won't help.
>>>> >
>>>> > For example, I have an application with a requirement of 1000 vcores.
>>>> But, I >>>> > wanted to control this application not to go beyond 100 vcores at any >>>> point >>>> > of time in the cluster/queue. This makes that application to run >>>> longer even >>>> > when my cluster is free but I will be able meet the guaranteed SLAs >>>> of other >>>> > applications. >>>> > >>>> > >>>> > >>>> > Hope this helps to understand my question. >>>> > >>>> > >>>> > >>>> > And thanks Narasimha for quick response. >>>> > >>>> > >>>> > >>>> > On 17 September 2015 at 16:17, Naganarasimha Garla >>>> > <[email protected]> wrote: >>>> > >>>> > Hi Laxman, >>>> > >>>> > Yes if cgroups are enabled and >>>> "yarn.scheduler.capacity.resource-calculator" >>>> > configured to DominantResourceCalculator then cpu and memory can be >>>> > controlled. >>>> > >>>> > Please Kindly furhter refer to the official documentation >>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html >>>> > >>>> > >>>> > >>>> > But may be if say more about problem then we can suggest ideal >>>> > configuration, seems like capacity configuration and splitting of the >>>> queue >>>> > is not rightly done or you might refer to Fair Scheduler if you want >>>> more >>>> > fairness for container allocation for different apps. >>>> > >>>> > >>>> > >>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <[email protected]> >>>> wrote: >>>> > >>>> > Hi, >>>> > >>>> > >>>> > >>>> > In YARN, do we have any way to control the amount of resources >>>> (vcores, >>>> > memory) used by an application SIMULTANEOUSLY. >>>> > >>>> > >>>> > >>>> > - In my cluster, noticed some large and long running mr-app occupied >>>> all the >>>> > slots of the queue and blocking other apps to get started. >>>> > >>>> > - I'm using Capacity schedulers (using hierarchical queues and >>>> preemption >>>> > disabled) >>>> > >>>> > - Using Hadoop version 2.6.0 >>>> > >>>> > - Did some googling around this and gone through configuration docs >>>> but I'm >>>> > not able to find anything that matches my requirement. 
>>>> >
>>>> > If needed, I can provide more details on the use case and the
>>>> > problem.
>>>> >
>>>> > --
>>>> > Thanks,
>>>> > Laxman
>>>
>>> --
>>> Thanks,
>>> Laxman
>>
>> --
>> Thanks,
>> Laxman
>
> --
> Thanks,
> Laxman
