[jira] [Updated] (YARN-9928) ATSv2 can make NM go down with a FATAL error while it is resyncing with RM

2019-10-22 Thread Tarun Parimi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-9928:
---
Component/s: ATSv2

> ATSv2 can make NM go down with a FATAL error while it is resyncing with RM
> --
>
> Key: YARN-9928
> URL: https://issues.apache.org/jira/browse/YARN-9928
> Project: Hadoop YARN
> Issue Type: Bug
> Components: ATSv2
> Affects Versions: 3.1.0
> Reporter: Tarun Parimi
> Assignee: Tarun Parimi
> Priority: Major
>
> Encountered the FATAL error below in a NodeManager that was under heavy
> load and was also resyncing with the RM at the same time. This caused the
> NM to go down.
> {code:java}
> 2019-09-18 11:22:44,899 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerCreatedEvent(NMTimelinePublisher.java:216)
>     at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerEvent(NMTimelinePublisher.java:383)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1520)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1511)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
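
The trace shows a NullPointerException raised from a publish call and surfacing in the dispatcher thread, but it does not say which reference was null at NMTimelinePublisher.java:216. The sketch below is only an illustration of that general failure mode, not the actual Hadoop code or its fix: the class DispatcherNpeSketch, the CLIENTS map, and all method names are hypothetical. It assumes a per-application lookup that can come back empty (for example, if application state is cleaned up during a resync) and contrasts an unguarded dereference with a guarded one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DispatcherNpeSketch {

    // Hypothetical per-application registry. The assumption here is that an
    // entry may be missing while the node is resyncing.
    static final Map<String, StringBuilder> CLIENTS = new ConcurrentHashMap<>();

    // Unguarded publish: dereferences the lookup result without a null check.
    static void publishUnguarded(String appId, String event) {
        StringBuilder client = CLIENTS.get(appId);
        client.append(event); // throws NullPointerException if appId is unknown
    }

    // Guarded publish: drops the event when no client is registered.
    static boolean publishGuarded(String appId, String event) {
        StringBuilder client = CLIENTS.get(appId);
        if (client == null) {
            return false; // skip the event instead of throwing
        }
        client.append(event);
        return true;
    }

    // Dispatcher-style wrapper: any Throwable from a handler is reported as
    // fatal. A real dispatcher might shut the process down at this point;
    // this sketch only returns false.
    static boolean dispatch(Runnable handler) {
        try {
            handler.run();
            return true;
        } catch (Throwable t) {
            System.err.println("FATAL: error in dispatcher thread: " + t);
            return false;
        }
    }

    public static void main(String[] args) {
        CLIENTS.put("app_1", new StringBuilder());
        // Known application: the handler completes normally.
        System.out.println(dispatch(() -> publishUnguarded("app_1", "CREATED")));
        // Unknown application, unguarded: the NPE reaches the dispatcher.
        System.out.println(dispatch(() -> publishUnguarded("app_2", "CREATED")));
        // Unknown application, guarded: the event is dropped, nothing is thrown.
        System.out.println(dispatch(() -> publishGuarded("app_2", "CREATED")));
    }
}
```

Under these assumptions, a single missing entry is enough to turn one event into a dispatcher-level failure, while the guarded variant trades that for a silently dropped timeline event.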



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9928) ATSv2 can make NM go down with a FATAL error while it is resyncing with RM

2019-10-22 Thread Tarun Parimi (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-9928:
---
Affects Version/s: 3.1.0

> ATSv2 can make NM go down with a FATAL error while it is resyncing with RM
> --
>
> Key: YARN-9928
> URL: https://issues.apache.org/jira/browse/YARN-9928
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Tarun Parimi
> Assignee: Tarun Parimi
> Priority: Major
>
> Encountered the FATAL error below in a NodeManager that was under heavy
> load and was also resyncing with the RM at the same time. This caused the
> NM to go down.
> {code:java}
> 2019-09-18 11:22:44,899 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerCreatedEvent(NMTimelinePublisher.java:216)
>     at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerEvent(NMTimelinePublisher.java:383)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1520)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1511)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>     at java.lang.Thread.run(Thread.java:748)
> {code}


