[ https://issues.apache.org/jira/browse/YARN-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prabhu Joseph resolved YARN-4730.
---------------------------------
    Resolution: Duplicate

YARN-2026

> YARN preemption based on instantaneous fair share
> -------------------------------------------------
>
>                 Key: YARN-4730
>                 URL: https://issues.apache.org/jira/browse/YARN-4730
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Prabhu Joseph
>
> On a big cluster with a total cluster resource of 10TB and 3000 cores, the
> Fair Scheduler has 230 queues and a total of 60000 jobs run a day. [All 230
> queues are very critical, and hence minResource is the same for all of them.]
> In this case, when a Spark job runs on queue A, occupies the entire cluster
> resource and does not release any of it, and another job is then submitted
> to queue B, preemption reclaims only the fair share, which is
> <10TB, 3000> / 230 = <45 GB, 13 cores>. That is a very small fair share for
> a queue shared by many applications.
> Preemption should instead reclaim the instantaneous fair share, that is
> <10TB, 3000> / 2 (active queues) = 5TB and 1500 cores, so that the first job
> does not hog the entire cluster resource and the subsequent jobs also run
> fine.
> This issue arises only when the number of queues is very high. With a small
> number of queues, preempting up to the fair share would suffice, as the fair
> share will be high. With a very large number of queues, preemption should
> try to reclaim the instantaneous fair share.
> Note: Configuring an optimal maxResources for 230 queues is difficult, and
> constraining the queues with maxResources would leave cluster resources
> idle most of the time.
> There are 1000s of Spark jobs, so asking each user to restrict the number
> of executors is also difficult.
> Preempting up to the instantaneous fair share will help to overcome the
> above issues.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
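
The arithmetic in the description can be reproduced with a small sketch. It is not Fair Scheduler code: it assumes equal queue weights, ignores minResource and maxResources, and takes 1 TB = 1024 GB. The names `Resource` and `equal_share` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A <memory, cores> pair, as written in the issue description."""
    memory_gb: float
    vcores: float


def equal_share(total: Resource, queue_count: int) -> Resource:
    """Split the cluster evenly across queue_count queues.

    Assumption: all queues have equal weight and no min/max limits apply.
    """
    if queue_count <= 0:
        raise ValueError("queue_count must be positive")
    return Resource(total.memory_gb / queue_count, total.vcores / queue_count)


# Cluster from the report: 10 TB of memory and 3000 cores.
cluster = Resource(memory_gb=10 * 1024, vcores=3000)

# Share when dividing across all 230 configured queues.
steady = equal_share(cluster, 230)

# Share when dividing across only the 2 active queues (A and B).
instantaneous = equal_share(cluster, 2)

print(f"all 230 queues:  {steady.memory_gb:.1f} GB, {steady.vcores:.1f} cores")
print(f"2 active queues: {instantaneous.memory_gb:.1f} GB, "
      f"{instantaneous.vcores:.1f} cores")
```

Under these assumptions the first figure matches the roughly <45 GB, 13 cores> quoted in the description, and the second matches the 5TB and 1500 cores requested for queue B.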