Ashwin Shankar created YARN-3453:
------------------------------------

             Summary: Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
                 Key: YARN-3453
                 URL: https://issues.apache.org/jira/browse/YARN-3453
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.6.0
            Reporter: Ashwin Shankar

There are two places in the preemption code flow where DefaultResourceCalculator is used even in DRF mode. As a result, more resources are preempted than needed, and the extra preempted containers do not even reach the “starved” queue, because the scheduling logic is based on DRF's calculator.

The two places are:

1.
{code:title=FSLeafQueue.java|borderStyle=solid}
private boolean isStarved(Resource share)
{code}
A queue shouldn’t be marked as “starved” if its dominant resource usage is >= its fair/min share.

2.
{code:title=FairScheduler.java|borderStyle=solid}
protected Resource resToPreempt(FSLeafQueue sched, long curTime)
{code}

--------------------------------------------------------------

One more thing that I believe needs to change in DRF mode: during a preemption round, if preempting a few containers satisfies the need for one resource type, we should exit that preemption round, since the containers we just preempted should bring the dominant resource usage to the min/fair share.
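
To illustrate the starvation check described in point 1, here is a minimal, self-contained sketch of how a memory-only comparison and a dominant-resource comparison can disagree for the same queue. It is not the Hadoop code: the class StarvationSketch, the Res type, and all method names are hypothetical stand-ins, the numbers are made up, and the real isStarved also takes the queue's demand into account, which is left out here.

```java
// Hypothetical, simplified model for illustration only; these are NOT the
// Hadoop YARN classes, and demand is ignored to keep the example short.
final class StarvationSketch {

    /** Minimal stand-in for a resource vector: memory in MB and virtual cores. */
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Largest fraction of the cluster that r occupies across both resource types. */
    static double dominantShare(Res r, Res cluster) {
        double mem = (double) r.memoryMb / cluster.memoryMb;
        double cpu = (double) r.vcores / cluster.vcores;
        return Math.max(mem, cpu);
    }

    /** Memory-only check: starved when memory usage is below the share's memory. */
    static boolean isStarvedMemoryOnly(Res usage, Res share) {
        return usage.memoryMb < share.memoryMb;
    }

    /** Dominant-resource check: starved only when the dominant usage is below the share's dominant share. */
    static boolean isStarvedDominant(Res usage, Res share, Res cluster) {
        return dominantShare(usage, cluster) < dominantShare(share, cluster);
    }

    public static void main(String[] args) {
        Res cluster = new Res(100_000, 10); // assumed cluster capacity
        Res share = new Res(4_000, 4);      // assumed fair/min share of the queue
        Res usage = new Res(2_000, 8);      // CPU-heavy queue, light on memory

        System.out.println("memory-only starved: " + isStarvedMemoryOnly(usage, share));
        System.out.println("dominant starved:    " + isStarvedDominant(usage, share, cluster));
    }
}
```

With these assumed numbers, the memory-only check reports the queue as starved (2000 MB < 4000 MB), while the dominant-resource check does not (dominant usage 0.8 of the cluster versus 0.4 for the share). That mismatch is the kind of disagreement that would trigger preemption the scheduler then cannot use for the "starved" queue.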