Thanks Harsh J for sharing that useful info, but can we think of some way to support this scenario from the YARN side? For example, the queue configuration I mentioned, or an app-specific override along the lines Laxman suggested?
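For anyone following the thread, a minimal sketch of the per-job knobs Harsh referenced below (added by MAPREDUCE-5583; the values here are illustrative and the property names should be verified against your Hadoop release):

    <!-- Cap how many map/reduce tasks of this job may run at the same time;
         0 (the default) means no limit. -->
    <property>
      <name>mapreduce.job.running.map.limit</name>
      <value>50</value>
    </property>
    <property>
      <name>mapreduce.job.running.reduce.limit</name>
      <value>10</value>
    </property>

The same limits can also be passed per job on the command line, e.g. -Dmapreduce.job.running.map.limit=50.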
On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <[email protected]> wrote:

> Thanks and perfect, Harsh. Exactly what I am looking for. Most of our applications are MR, so this should be sufficient for us. I will give these configurations a try and post my findings again here. Thanks again.
>
> Thanks Naga, Rohith & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <[email protected]> wrote:
>
>> If all your apps are MR, then what you are looking for is MAPREDUCE-5583 (it can be set per-job).
>>
>> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <[email protected]> wrote:
>>
>>> Hi Naga,
>>>
>>> Like most of the app-level configurations, the admin can configure the defaults, which the user may want to override at the application level.
>>>
>>> If this is at queue level, then all applications in a queue will have the same limits. But all our applications in a queue may not have the same SLA, and we may need to restrict them differently. That again requires splitting queues further, which I feel is more overhead.
>>>
>>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> Ideally I understand it would be better if this were available at the application level, but then each user is expected to ensure that he gives the right configuration, within the limits of max capacity. And what if a user submits some app *(kind of a query execution app)* without this setting, *or* he doesn't know how much it should take? In general, users specifying resources for containers is itself a difficult task. And it might not be right to expect that the admin will do it for each application in the queue either. Basically, governing will be difficult if it is not enforced from the queue/scheduler side.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [[email protected]]
>>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>> *To:* [email protected]
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> IMO, it's better to have an application-level configuration than a scheduler/queue-level configuration. Having a queue-level configuration will restrict every single application that runs in that queue. But we may want to configure these limits for only some set of jobs, and the limits can be different for every application.
>>>>
>>>> On the FairOrdering policy: the order of jobs can't be enforced, as these are ad hoc jobs scheduled/owned independently by different teams.
>>>>
>>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>>
>>>>> Hi Laxman,
>>>>>
>>>>> What I meant was: suppose we support and configure yarn.scheduler.capacity.<queue-path>.app-limit-factor to 0.25; then a single app should not take more than 25% of the resources in the queue. This would be a more generic configuration which can be enforced by the admin, rather than expecting it to be configured per app by the user.
>>>>>
>>>>> And for Rohith's suggestion of the FairOrdering policy, I think it should solve the problem if the app which is submitted first has not already hogged all the queue's resources.
>>>>>
>>>>> + Naga
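For reference, a minimal sketch of how the FAIR ordering policy discussed here would be enabled per queue in capacity-scheduler.xml on Hadoop 2.8 (the queue name "default" is only a placeholder, and the exact property names should be double-checked against the 2.8 CapacityScheduler documentation):

    <!-- Order apps within the queue for fairness instead of FIFO -->
    <property>
      <name>yarn.scheduler.capacity.root.default.ordering-policy</name>
      <value>fair</value>
    </property>
    <!-- Optionally weigh pending demand so large apps are not starved by a stream of small ones -->
    <property>
      <name>yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight</name>
      <value>true</value>
    </property>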
>>>>>
>>>>> ------------------------------
>>>>> *From:* Laxman Ch [[email protected]]
>>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Concurrency control
>>>>>
>>>>> Thanks Rohith, Naga and Lloyd for the responses.
>>>>>
>>>>> > I think Laxman should also tell us more about which application type he is running.
>>>>>
>>>>> We run MR jobs mostly with the default core/memory allocation (1 vcore, 1.5 GB). Our problem is more about controlling the *resources used simultaneously by all running containers* at any given point of time, per application.
>>>>>
>>>>> Example:
>>>>> 1. App1 and App2 are two MR apps.
>>>>> 2. App1 and App2 belong to the same queue (capacity: 100 vcores, 150 GB).
>>>>> 3. Each App1 task takes 8 hrs to complete.
>>>>> 4. Each App2 task takes 5 mins to complete.
>>>>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>>>>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits a long time for App1 tasks to release resources.
>>>>> 7. We can't have preemption enabled, as we don't want to lose the work completed so far by App1.
>>>>> 8. We can't have separate queues for App1 and App2, as we have lots of jobs like this and it would explode the number of queues.
>>>>> 9. We use CapacityScheduler.
>>>>>
>>>>> In this scenario, if I can limit App1's concurrent usage to 50 vcores and 75 GB, then App1 may take longer to finish, but there won't be any starvation for App2 (and other jobs running in the same queue).
>>>>>
>>>>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>>>>
>>>>> @Naga, I couldn't think through the expected behavior of "yarn.scheduler.capacity.<queue-path>.app-limit-factor". I will revert on this.
>>>>>
>>>>> On 29 September 2015 at 14:57, Namikaze Minato <[email protected]> wrote:
>>>>>
>>>>>> I think Laxman should also tell us more about which application type he is running. The normal use case of MAPREDUCE should be working as intended, but if he has, for example, one map using 100 vcores, then the second map will have to wait until the app completes. The same would happen if the applications running were Spark, as Spark does not free what is allocated to it.
>>>>>>
>>>>>> Regards,
>>>>>> LLoyd
>>>>>>
>>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga) <[email protected]> wrote:
>>>>>> > Thanks Rohith for your thoughts,
>>>>>> > But I think this configuration might not completely solve the scenario mentioned by Laxman: if there is some time gap between the first and the second app, then even though we have fairness or priority set for apps, starvation will still be there.
>>>>>> > IIUC, we can think of an approach similar to "yarn.scheduler.capacity.<queue-path>.user-limit-factor", providing functionality like "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple of the queue capacity that a single app can be allowed to acquire. Thoughts?
>>>>>> >
>>>>>> > + Naga
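To make that proposal concrete, a purely hypothetical sketch of what it could look like in capacity-scheduler.xml (this property does not exist in any released CapacityScheduler; the queue name "default" is a placeholder):

    <!-- Hypothetical property, analogous to user-limit-factor:
         0.25 would mean a single application may hold at most 25% of the queue's resources. -->
    <property>
      <name>yarn.scheduler.capacity.root.default.app-limit-factor</name>
      <value>0.25</value>
    </property>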
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: Rohith Sharma K S [[email protected]]
>>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>>> > To: [email protected]
>>>>>> > Subject: RE: Concurrency control
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides a configuration for the ordering policy. By configuring FAIR_ORDERING_POLICY in CS, you should probably be able to achieve your goal, i.e. avoid starving applications of resources.
>>>>>> >
>>>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>>> >
>>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see FairScheduler FairSharePolicy); generally, processes with lesser usage are ordered first. If sizeBasedWeight is set to true, then an application with high demand may be prioritized ahead of an application with less usage. This is to offset the tendency to favor small apps, which could result in starvation for large apps if many small ones enter and leave the queue continuously (optional, default false).
>>>>>> >
>>>>>> > Community issue id: https://issues.apache.org/jira/browse/YARN-3463
>>>>>> >
>>>>>> > Thanks & Regards
>>>>>> > Rohith Sharma K S
>>>>>> >
>>>>>> > From: Laxman Ch [mailto:[email protected]]
>>>>>> > Sent: 29 September 2015 13:36
>>>>>> > To: [email protected]
>>>>>> > Subject: Re: Concurrency control
>>>>>> >
>>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>>> >
>>>>>> > On 17 September 2015 at 23:21, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > No Naga, that won't help.
>>>>>> >
>>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores) as the same user in the same queue (capacity = 100 vcores). In this scenario, if app1 starts first, occupies all the slots, and runs long, then app2 will starve for a long time.
>>>>>> >
>>>>>> > Let me reiterate my problem statement. I want "to control the amount of resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>>>>> >
>>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> > For the example you have stated, maybe we can do the following things (see the sketch below):
>>>>>> > 1. Create/modify the queue with capacity and max capacity set such that they are equivalent to 100 vcores. As there is then no elasticity, the given application will not use resources beyond the configured capacity.
>>>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so that each active user is assured the minimum guaranteed resources. The default value of 100 implies no user limits are imposed.
>>>>>> >
>>>>>> > Additionally, we can think of "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage", which will enforce strict CPU usage for a given container if required.
>>>>>> >
>>>>>> > + Naga
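A rough sketch of the settings Naga lists above, assuming a hypothetical queue root.default and purely illustrative percentages (capacity is expressed as a percentage of the parent queue, so pick values that work out to the intended 100 vcores):

    <!-- capacity-scheduler.xml: no elasticity, since maximum-capacity equals capacity -->
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>50</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
      <value>50</value>
    </property>
    <!-- Guarantee each active user in the queue at least 50% of the queue's resources -->
    <property>
      <name>yarn.scheduler.capacity.root.default.minimum-user-limit-percent</name>
      <value>50</value>
    </property>

    <!-- yarn-site.xml: make cgroups enforce the CPU a container was actually allocated -->
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
      <value>true</value>
    </property>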
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > Yes, I'm already using cgroups. Cgroups help in controlling resources at the container level, but my requirement is more about controlling the concurrent resource usage of an application at the whole-cluster level.
>>>>>> >
>>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>>> >
>>>>>> > For example, I have an application with a requirement of 1000 vcores, but I want to restrict this application from going beyond 100 vcores at any point of time in the cluster/queue. This makes the application run longer even when my cluster is free, but I will be able to meet the guaranteed SLAs of other applications.
>>>>>> >
>>>>>> > Hope this helps to understand my question.
>>>>>> >
>>>>>> > And thanks, Narasimha, for the quick response.
>>>>>> >
>>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> > Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator" is configured to DominantResourceCalculator, then CPU and memory can be controlled.
>>>>>> > Please further refer to the official documentation:
>>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>>> >
>>>>>> > But maybe if you say more about the problem, we can suggest an ideal configuration; it seems like the capacity configuration and splitting of the queues is not done right, or you might look at the Fair Scheduler if you want more fairness in container allocation across apps.
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <[email protected]> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> > In YARN, do we have any way to control the amount of resources (vcores, memory) used by an application SIMULTANEOUSLY?
>>>>>> >
>>>>>> > - In my cluster, I noticed some large, long-running MR apps occupy all the slots of the queue and block other apps from getting started.
>>>>>> > - I'm using the Capacity Scheduler (with hierarchical queues and preemption disabled).
>>>>>> > - Using Hadoop version 2.6.0.
>>>>>> > - I did some googling around this and went through the configuration docs, but I'm not able to find anything that matches my requirement.
>>>>>> >
>>>>>> > If needed, I can provide more details on the use case and problem.
>>>>>> >
>>>>>> > --
>>>>>> > Thanks,
>>>>>> > Laxman
>
> --
> Thanks,
> Laxman
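For completeness, a rough sketch of the cgroups plus DominantResourceCalculator setup mentioned earlier in the thread (property and class names as in Hadoop 2.6; the full LinuxContainerExecutor/cgroups prerequisites are in the official docs and should be checked for your version):

    <!-- capacity-scheduler.xml: schedule on both memory and vcores -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

    <!-- yarn-site.xml: LinuxContainerExecutor with the cgroups resources handler for CPU isolation -->
    <property>
      <name>yarn.nodemanager.container-executor.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
    </property>
    <property>
      <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
      <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
    </property>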
