[ https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438406#comment-16438406 ]
Daniel Li commented on YARN-4931:
---------------------------------

Hi [~kasha], any update on addressing this issue in FairScheduler? Thanks.

> Preempted resources go back to the same application
> ---------------------------------------------------
>
>              Key: YARN-4931
>              URL: https://issues.apache.org/jira/browse/YARN-4931
>          Project: Hadoop YARN
>       Issue Type: Bug
>       Components: fairscheduler
> Affects Versions: 2.7.2
>         Reporter: Miles Crawford
>         Priority: Major
>      Attachments: resourcemanager.log
>
> Sometimes a queue that needs resources causes preemption - but the preempted containers are just allocated right back to the application that just released them!
> Here is a tiny application (0007) that wants resources, and a container is preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler (FairSchedulerUpdateThread): Should preempt <memory:448, vCores:0> res for queue root.default: resDueToMinShare = <memory:0, vCores:0>, resDueToFairShare = <memory:448, vCores:0>
> 2016-04-07 21:08:13,463 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler (FairSchedulerUpdateThread): Preempting container (prio=1res=<memory:15264, vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics (FairSchedulerUpdateThread): Non-AM container preempted, current appAttemptId=appattempt_1460047303577_0002_000001, containerId=container_1460047303577_0002_01_001038, resource=<memory:15264, vCores:1>
> 2016-04-07 21:08:13,463 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 0002 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode (ResourceManager Event Processor): Assigned container container_1460047303577_0002_01_001039 of capacity <memory:15264, vCores:1> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 containers, <memory:241248, vCores:18> used and <memory:416, vCores:46> available after allocation
> 2016-04-07 21:08:14,555 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (ResourceManager Event Processor): container_1460047303577_0002_01_001039 Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never starting at all.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
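The logs above suggest an ordering problem: the preempted application (0002) still has outstanding resource requests, so when the node's freed capacity is offered out again in normal scheduling order, 0002 can win it back before the starved application (0007) is even considered. The following toy model sketches that failure mode and a possible mitigation (earmarking freed capacity for the app that triggered preemption). All class and method names here are invented for illustration; this is not Hadoop's actual scheduler API.

```java
import java.util.List;

// Simplified model of handing a freed container to the next app with demand.
// In the buggy scenario, the app that was just preempted is still first in
// scheduling order, so it is offered the container it just lost.
class PreemptionModel {
    static String assignFreedContainer(List<String> appsWithDemand,
                                       String starvedApp,
                                       boolean earmarkForStarvedApp) {
        // Mitigation sketch: reserve the freed capacity for the app whose
        // unmet fair share triggered the preemption in the first place.
        if (earmarkForStarvedApp && appsWithDemand.contains(starvedApp)) {
            return starvedApp;
        }
        // Without earmarking, the container simply goes to the next app in
        // scheduling order -- often the one that was just preempted.
        return appsWithDemand.get(0);
    }

    public static void main(String[] args) {
        List<String> demand = List.of("app_0002", "app_0007");
        // Bug as reported: the container goes straight back to app_0002.
        System.out.println(assignFreedContainer(demand, "app_0007", false));
        // With earmarking, the starved app_0007 receives it instead.
        System.out.println(assignFreedContainer(demand, "app_0007", true));
    }
}
```

This is only a conceptual sketch of the race described in the ticket; the real fix would have to live in FairScheduler's container-allocation path.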