MENG DING created YARN-4230: ------------------------------- Summary: Increasing container resource while there is no headroom left will cause ResourceManager to crash Key: YARN-4230 URL: https://issues.apache.org/jira/browse/YARN-4230 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: MENG DING Assignee: MENG DING Priority: Critical
This issue was found while doing end-to-end test of YARN-1197 in YARN-4175. When increasing resource of a container, if there is no headroom left for the user, the ResourceManager crashes with NPE. The following is the stack trace: {code} 15/10/05 20:35:21 INFO capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.9375 absoluteUsedCapacity=0.9375 used=<memory:15360, vCores:9> cluster=<memory:16384, vCores:16> 15/10/05 20:35:49 FATAL resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.IncreaseContainerAllocator.assignContainers(IncreaseContainerAllocator.java:327) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:66) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:474) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:819) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:572) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:423) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1177) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1274) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:134) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691) at java.lang.Thread.run(Thread.java:745) 15/10/05 20:35:49 INFO resourcemanager.ResourceManager: Exiting, bbye.. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)