Giovanni Matteo Fumarola created YARN-3369: ----------------------------------------------
Summary: Missing NullPointer check in AppSchedulingInfo causes RM to die Key: YARN-3369 URL: https://issues.apache.org/jira/browse/YARN-3369 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Giovanni Matteo Fumarola In AppSchedulingInfo.java the method checkForDeactivation() has these 2 consecutive lines: {quote} {color:red} ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY); if (request.getNumContainers() > 0) { {color} {quote} the first line calls getResourceRequest and it can return null. {quote} synchronized public ResourceRequest getResourceRequest( Priority priority, String resourceName) { Map<String, ResourceRequest> nodeRequests = requests.get(priority); {color:red} *return* {color} (nodeRequests == null) ? {color:red} *null* {color} : nodeRequests.get(resourceName); } {quote} The second line dereferences the pointer directly without a check. If the pointer is null, the RM dies. {quote}2015-03-17 14:14:04,757 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739) at java.lang.Thread.run(Thread.java:722) {color:red} *2015-03-17 14:14:04,758 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..*{color} {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)