lfxy opened a new pull request, #5627:
URL: https://github.com/apache/hadoop/pull/5627
In AbstractYarnScheduler::completeOustandingUpdatesWhichAreReserved(), the
reserved container returned by getReservedContainer() can be removed by the
scheduler before it is used. When that happens, a NullPointerException (NPE)
is thrown and the ResourceManager crashes, as shown in the log below.
2023-05-07 02:04:38,201 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type CONTAINER_EXPIRED to the Event Dispatcher
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completeOustandingUpdatesWhichAreReserved(AbstractYarnScheduler.java:725)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:686)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1927)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:172)
        at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:74)
        at java.lang.Thread.run(Thread.java:748)
2023-05-07 02:04:38,201 INFO [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..
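
The failure mode described above (a reference that was non-null a moment ago
being cleared before it is dereferenced) can be illustrated with a small,
self-contained sketch. This is not the Hadoop code and not necessarily the
change made in this pull request: the classes Reserved and Node and the method
isReservedFor() are hypothetical stand-ins, and only the name
getReservedContainer() is borrowed from the description. The idea is to read
the reservation once into a local variable and null-check that snapshot,
rather than calling the getter repeatedly.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ReservedContainerGuard {

    /** Hypothetical stand-in for a reserved container; not a Hadoop class. */
    static final class Reserved {
        private final String containerToUpdate;

        Reserved(String containerToUpdate) {
            this.containerToUpdate = containerToUpdate;
        }

        String getContainerToUpdate() {
            return containerToUpdate;
        }
    }

    /** Hypothetical stand-in for a node whose reservation another thread may clear. */
    static final class Node {
        private final AtomicReference<Reserved> reserved = new AtomicReference<>();

        void reserve(Reserved r) {
            reserved.set(r);
        }

        void unreserve() {
            reserved.set(null);
        }

        Reserved getReservedContainer() {
            return reserved.get();
        }
    }

    /**
     * Returns true if the node currently holds a reservation for containerId.
     * The reservation is read exactly once, so a concurrent unreserve() cannot
     * turn a checked reference into null between the check and the use.
     */
    static boolean isReservedFor(Node node, String containerId) {
        if (node == null) {
            return false;
        }
        Reserved snapshot = node.getReservedContainer(); // single read
        if (snapshot == null) {
            return false; // reservation already removed; nothing to complete
        }
        return containerId.equals(snapshot.getContainerToUpdate());
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.reserve(new Reserved("container_1"));
        System.out.println(isReservedFor(node, "container_1")); // prints "true"

        node.unreserve(); // simulates the scheduler removing the reservation
        System.out.println(isReservedFor(node, "container_1")); // prints "false"
    }
}
```

With this pattern a removed reservation is treated as "nothing to do" instead
of surfacing as an NPE on the event-dispatcher thread.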
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]