Hi Laxman,

In Hadoop 2.8 (not yet released), the CapacityScheduler provides a setting for 
configuring the ordering policy. By configuring FAIR_ORDERING_POLICY in CS, 
you should probably be able to achieve your goal, i.e. keeping applications 
from starving for resources.


org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
An OrderingPolicy which orders SchedulableEntities for fairness (see 
FairScheduler FairSharePolicy). Generally, processes with lesser usage compare 
as lesser, i.e. they are ordered first. If sizeBasedWeight is set to true, then 
an application with high demand may be prioritized ahead of an application with 
less usage. This is to offset the tendency to favor small apps, which could 
result in starvation for large apps if many small ones enter and leave the 
queue continuously (optional, default false).
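
As a rough illustration of the ordering idea described above, here is a small 
Python sketch. It is not the ResourceManager code: the App class, the 
magnitude() and fair_order() helpers, and the log2(1 + demand) weighting are 
assumptions made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class App:
    name: str
    used: int    # resources currently allocated (hypothetical units)
    demand: int  # total resources the app is asking for


def magnitude(app, size_based_weight=False):
    """Sort key: a smaller value means the app is offered resources earlier."""
    mag = float(app.used)
    if size_based_weight:
        # Assumed weighting for illustration only: log2(1 + demand).
        weight = math.log2(1 + app.demand)
        if weight > 0:  # avoid dividing by zero for apps with no demand
            mag /= weight
    return mag


def fair_order(apps, size_based_weight=False):
    """Return apps sorted by ascending magnitude (ties keep input order)."""
    return sorted(apps, key=lambda a: magnitude(a, size_based_weight))


apps = [
    App("large", used=40, demand=100_000),
    App("small", used=20, demand=20),
]
print([a.name for a in fair_order(apps)])
# ['small', 'large']
print([a.name for a in fair_order(apps, size_based_weight=True)])
# ['large', 'small']
```

Without the weight, the app with less usage goes first. With it, the 
high-demand app can move ahead, which is the offset against favoring small 
apps.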


Community Issue Id :  https://issues.apache.org/jira/browse/YARN-3463

Thanks & Regards
Rohith Sharma K S

From: Laxman Ch [mailto:[email protected]]
Sent: 29 September 2015 13:36
To: [email protected]
Subject: Re: Concurrency control

Bouncing this thread again. Any other thoughts please?

On 17 September 2015 at 23:21, Laxman Ch <[email protected]> wrote:
No, Naga. That won't help.

I am running two applications (app1 - 100 vcores, app2 - 100 vcores) as the 
same user in the same queue (capacity = 100 vcores). In this scenario, if app1 
is triggered first, it occupies all the slots, and if it runs long, app2 
starves for a long time.

Let me reiterate my problem statement: I want "to control the amount of 
resources (vcores, memory) used by an application SIMULTANEOUSLY".

On 17 September 2015 at 22:28, Naganarasimha Garla <[email protected]> wrote:
Hi Laxman,
For the example you have stated, maybe we can do the following:
1. Create/modify the queue with capacity and maximum capacity both set to the 
equivalent of 100 vcores. As there is then no elasticity, a given application 
will not use resources beyond the configured capacity.
2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so that 
each active user is assured the minimum guaranteed resources. The default 
value of 100 implies no user limits are imposed.

Additionally, we can consider 
"yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage", 
which enforces strict CPU usage for a given container if required.

+ Naga

On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <[email protected]> wrote:
Yes. I'm already using cgroups. Cgroups help in controlling resources at the 
container level. But my requirement is more about controlling the concurrent 
resource usage of an application at the whole-cluster level.

And yes, we do configure queues properly. But, that won't help.

For example, I have an application that requires 1000 vcores, but I want to 
keep it from going beyond 100 vcores at any point in time in the 
cluster/queue. This makes the application run longer even when my cluster is 
free, but I will be able to meet the guaranteed SLAs of other applications.

Hope this helps to understand my question.

And thanks, Narasimha, for the quick response.

On 17 September 2015 at 16:17, Naganarasimha Garla <[email protected]> wrote:
Hi Laxman,
Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator" 
is configured to DominantResourceCalculator, then CPU and memory can be 
controlled. Please refer to the official documentation for further details:
http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html

But maybe if you say more about the problem, we can suggest an ideal 
configuration. It seems like the capacity configuration and splitting of the 
queues is not done right, or you might look at the Fair Scheduler if you want 
more fairness in container allocation across different apps.

On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <[email protected]> wrote:
Hi,

In YARN, do we have any way to control the amount of resources (vcores, memory) 
used by an application SIMULTANEOUSLY?

- In my cluster, I noticed a large, long-running MR app occupying all the 
slots of the queue and blocking other apps from starting.
- I'm using the Capacity Scheduler (with hierarchical queues and preemption 
disabled).
- Using Hadoop version 2.6.0.
- I did some googling around this and went through the configuration docs, but 
I'm not able to find anything that matches my requirement.

If needed, I can provide more details on the use case and the problem.

--
Thanks,
Laxman
