Jason Lowe created TEZ-3368:
-------------------------------

             Summary: NPE in DelayedContainerManager
                 Key: TEZ-3368
                 URL: https://issues.apache.org/jira/browse/TEZ-3368
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.7.1
            Reporter: Jason Lowe

Saw a Tez AM hang due to an NPE in the DelayedContainerManager:
{noformat}
2016-07-17 01:53:23,157 [ERROR] [DelayedContainerManager] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
java.lang.NullPointerException
	at org.apache.tez.dag.app.rm.TezAMRMClientAsync.getMatchingRequestsForTopPriority(TezAMRMClientAsync.java:142)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getMatchingRequestWithoutPriority(YarnTaskSchedulerService.java:1474)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$500(YarnTaskSchedulerService.java:84)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$NodeLocalContainerAssigner.assignReUsedContainer(YarnTaskSchedulerService.java:1869)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignReUsedContainerWithLocation(YarnTaskSchedulerService.java:1753)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:733)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:84)
	at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:2030)
{noformat}

After the DelayedContainerManager thread exited, the AM proceeded to receive requested containers that would go unused until the container allocations expired. Then they would be re-requested, and the cycle repeated indefinitely.
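
For illustration only, here is a minimal standalone sketch of the failure mode described above: a RuntimeException that escapes a worker thread's run loop ends that thread while the rest of the process keeps going. It is not Tez code and not a proposed fix. The class WorkerThreadSketch, the methods assign and runWorker, and the sample item names are all hypothetical. The guarded variant only shows that a loop can survive a bad item; whether catching the exception is appropriate for the real scheduler is a separate question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkerThreadSketch {

    // Hypothetical stand-in for one assignment attempt; throws NPE on a null entry.
    static void assign(String container) {
        container.length();
    }

    // Runs a single worker thread over the items and returns how many were handled.
    static int runWorker(List<String> items, boolean guarded) {
        AtomicInteger handled = new AtomicInteger();
        Thread worker = new Thread(() -> {
            for (String item : items) {
                if (guarded) {
                    try {
                        assign(item);
                        handled.incrementAndGet();
                    } catch (RuntimeException e) {
                        // Report the failure and keep the loop alive for later items.
                        System.err.println("skipping item after " + e);
                    }
                } else {
                    // An NPE here propagates out of run() and terminates the thread.
                    assign(item);
                    handled.incrementAndGet();
                }
            }
        }, "worker");
        // The handler is only notified; the thread is already on its way out.
        worker.setUncaughtExceptionHandler(
            (t, e) -> System.err.println(t.getName() + " died: " + e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("c1", null, "c2", "c3");
        System.out.println("unguarded handled: " + runWorker(items, false));
        System.out.println("guarded handled: " + runWorker(items, true));
    }
}
```

In the unguarded run the items after the null entry are never looked at, because the thread that would have processed them is gone. That mirrors the allocated containers sitting unused until they expire.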