[ 
https://issues.apache.org/jira/browse/YARN-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved YARN-4730.
---------------------------------
    Resolution: Duplicate

Duplicate of YARN-2026.

> YARN preemption based on instantaneous fair share
> -------------------------------------------------
>
>                 Key: YARN-4730
>                 URL: https://issues.apache.org/jira/browse/YARN-4730
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Prabhu Joseph
>
> On a large cluster with a total of 10 TB of memory and 3000 cores, the Fair 
> Scheduler has 230 queues and runs about 60000 jobs a day. [All 230 queues 
> are critical, so the minResource is the same for all of them.] In this case, 
> when a Spark job running on queue A occupies the entire cluster and does not 
> release any resource, a job submitted to queue B can preempt only its fair 
> share of <10 TB, 3000> / 230 = <45 GB, 13 cores>, which is far too little 
> for a queue shared by many applications.
> Preemption should instead reclaim the instantaneous fair share, i.e. 
> <10 TB, 3000> / 2 (active queues) = <5 TB, 1500 cores>, so that the first 
> job cannot hog the entire cluster and subsequent jobs run fine.
> This issue arises only when the number of queues is very high. With few 
> queues, preempting the fair share would suffice, since the fair share is 
> already large. But with very many queues, preemption should target the 
> instantaneous fair share.
> Note: Configuring optimal maxResources for 230 queues is difficult, and 
> constraining queues with maxResource would leave cluster resources idle 
> most of the time. There are thousands of Spark jobs, so asking each user 
> to restrict the number of executors is also impractical.
> Preempting the instantaneous fair share overcomes the above issues.
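The arithmetic in the report can be sketched as follows. This is an illustration of the two computations only; the function names are hypothetical and are not FairScheduler APIs.

```python
# Illustrative sketch of the fair-share arithmetic described in the issue.
# Resources are modeled as (memory in GB, vcores) tuples.

def static_fair_share(total, num_queues):
    """Fair share with minResource equal across all configured queues:
    the cluster is divided by the total queue count."""
    return tuple(r / num_queues for r in total)

def instantaneous_fair_share(total, num_active_queues):
    """Instantaneous fair share: the cluster is divided only among
    the queues that currently have demand."""
    return tuple(r / num_active_queues for r in total)

cluster = (10 * 1024, 3000)  # 10 TB of memory, 3000 vcores

# With 230 configured queues, preemption reclaims only ~45 GB / ~13 cores:
print(static_fair_share(cluster, 230))

# With 2 active queues, it would reclaim 5 TB / 1500 cores:
print(instantaneous_fair_share(cluster, 2))
```

With 230 queues the static share works out to roughly <44.5 GB, 13 cores>, matching the <45 GB, 13 cores> figure above, while dividing by the 2 active queues yields <5120 GB, 1500 cores>.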



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
