[ https://issues.apache.org/jira/browse/YARN-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610820#comment-16610820 ]
ASF GitHub Bot commented on YARN-8470:
--
GitHub user gg7 opened a pull request:
https://github.com/apache/hadoop/pull/416
YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()
I encountered this issue while running 3.1.0:
```
2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_55 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
```
I'm guessing a better fix would be to synchronise the removal of
applications, but this simple patch should be an improvement IMO.
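For readers without the diff at hand, here is a minimal sketch of the kind of null guard described above. It is not the actual patch: `PreemptionSketch`, `Container`, `App`, and `selectPreemptable` are hypothetical stand-ins, and the only assumption carried over from the logs is that a container can outlive the scheduler's record of its application, so the lookup may return null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PreemptionSketch {

    /** Hypothetical stand-in for a running container and its owning app attempt. */
    public static final class Container {
        final String id;
        final String appAttemptId;

        public Container(String id, String appAttemptId) {
            this.id = id;
            this.appAttemptId = appAttemptId;
        }
    }

    /** Hypothetical stand-in for the scheduler's view of an application attempt. */
    public static final class App {
        final String attemptId;
        final boolean preemptable;

        public App(String attemptId, boolean preemptable) {
            this.attemptId = attemptId;
            this.preemptable = preemptable;
        }
    }

    /**
     * Returns the IDs of containers that may be preempted.
     * liveApps may be modified by another thread, so a lookup can miss.
     */
    public static List<String> selectPreemptable(
            List<Container> containersOnNode, Map<String, App> liveApps) {
        List<String> selected = new ArrayList<>();
        for (Container c : containersOnNode) {
            App app = liveApps.get(c.appAttemptId);
            if (app == null) {
                // The app was removed after the container list was read:
                // skip the container instead of dereferencing null.
                continue;
            }
            if (app.preemptable) {
                selected.add(c.id);
            }
        }
        return selected;
    }
}
```

The guard only tolerates the race; it does not remove it, which is why synchronising application removal would be the more thorough fix.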
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gg7/hadoop gg7-yarn-8470-fix-npe
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hadoop/pull/416.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #416
commit a86c54c4db3954aca40ef297135a5e875c0a96a8
Author: George G
Date: 2018-09-11T15:00:00Z
YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()
I encountered this issue while running 3.1.0:
```
2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_55 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
at