[ https://issues.apache.org/jira/browse/TEZ-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555536#comment-15555536 ]
Jason Lowe commented on TEZ-3368:
---------------------------------
bq. is this patch still ready to go in?
Yes, I believe so.
> NPE in DelayedContainerManager
> ------------------------------
>
> Key: TEZ-3368
> URL: https://issues.apache.org/jira/browse/TEZ-3368
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: TEZ-3368.001.patch
>
>
> Saw a Tez AM hang due to an NPE in the DelayedContainerManager:
> {noformat}
> 2016-07-17 01:53:23,157 [ERROR] [DelayedContainerManager] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
> java.lang.NullPointerException
>     at org.apache.tez.dag.app.rm.TezAMRMClientAsync.getMatchingRequestsForTopPriority(TezAMRMClientAsync.java:142)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getMatchingRequestWithoutPriority(YarnTaskSchedulerService.java:1474)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$500(YarnTaskSchedulerService.java:84)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$NodeLocalContainerAssigner.assignReUsedContainer(YarnTaskSchedulerService.java:1869)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignReUsedContainerWithLocation(YarnTaskSchedulerService.java:1753)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:733)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:84)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:2030)
> {noformat}
> After the DelayedContainerManager thread exited, the AM continued to receive
> the containers it had requested, but they went unused until their allocations
> expired. The containers were then re-requested, and the cycle repeated
> indefinitely.
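
For illustration only, here is a minimal Java sketch of the failure mode described above: a background thread that exits on an uncaught NullPointerException and leaves later work unhandled, next to a variant that catches the exception per iteration. All names here (DelayedWorkerSketch, handle, process) are hypothetical; this is not Tez code and not the attached patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from Tez.
class DelayedWorkerSketch {

    // Stand-in for per-container work; throws NullPointerException on a null entry.
    static int handle(String container) {
        return container.length();
    }

    // Runs the loop on a background thread and returns how many entries were handled.
    // When guarded is false, the first NPE ends the thread and later entries are never seen.
    static int process(List<String> held, boolean guarded) {
        AtomicInteger handled = new AtomicInteger();
        Thread worker = new Thread(() -> {
            for (String c : held) {
                if (guarded) {
                    try {
                        handle(c);
                        handled.incrementAndGet();
                    } catch (RuntimeException e) {
                        // Report the bad entry and keep the loop alive.
                        System.err.println("skipping entry: " + e);
                    }
                } else {
                    handle(c);
                    handled.incrementAndGet();
                }
            }
        }, "DelayedWorkerSketch");
        // Report the thread's death in one line instead of the default stack trace.
        worker.setUncaughtExceptionHandler(
            (t, e) -> System.err.println(t.getName() + " exited: " + e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        // Arrays.asList permits null elements; the null models the bad lookup.
        List<String> held = Arrays.asList("c1", null, "c2");
        System.out.println("unguarded handled " + process(held, false)); // prints "unguarded handled 1"
        System.out.println("guarded handled " + process(held, true));    // prints "guarded handled 2"
    }
}
```

In the unguarded run the entry after the null is never processed, which is analogous to containers going unused once the thread is gone. Whether catching the exception or removing the null at its source is the right remedy depends on the actual code path.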
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)