[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114123#comment-17114123 ]

YCozy commented on YARN-9194:
-----------------------------

Hi, we were able to trigger the same bug (LAUNCH_FAILED at FAILED) in 2.10.0. Can we also backport the fix to that version? Thanks!

> Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9194
>                 URL: https://issues.apache.org/jira/browse/YARN-9194
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: lujie
>            Assignee: lujie
>            Priority: Critical
>             Fix For: 3.1.2, 3.3.0, 3.2.1
>
>         Attachments: YARN-9194_1.patch, YARN-9194_2.patch, YARN-9194_3.patch, YARN-9194_4.patch, YARN-9194_5.patch, YARN-9194_6.patch, hadoop-hires-resourcemanager-hadoop11.log
>
>
> While the attempt fails, the REGISTERED comes, hence the InvalidStateTransitionException happens.
>
> {code:java}
> 2019-01-13 00:41:57,127 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: App attempt: appattempt_1547311267249_0001_02 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: REGISTERED at FAILED
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:913)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:121)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1073)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1054)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>     at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
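The failure mode described in the quoted report (an event arriving after the attempt has already reached FAILED) can be reproduced with a small table-driven state machine. The sketch below is illustrative only: `AttemptStateMachine`, `AttemptState`, `AttemptEvent`, `InvalidTransitionException`, and the `ignoreLateEventsAtFailed` flag are hypothetical names, not Hadoop classes, and the "ignore late events in the terminal state" behaviour is an assumed mitigation, since this thread does not show what the attached patches actually change.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical, simplified stand-ins for the states and events named in the log.
enum AttemptState { LAUNCHED, RUNNING, FAILED }

enum AttemptEvent { REGISTERED, LAUNCH_FAILED, FAIL }

// Thrown when no transition is registered for (state, event); the message
// mirrors the "Invalid event: REGISTERED at FAILED" text in the stack trace.
class InvalidTransitionException extends RuntimeException {
    InvalidTransitionException(AttemptState state, AttemptEvent event) {
        super("Invalid event: " + event + " at " + state);
    }
}

class AttemptStateMachine {
    // Transition table: current state -> (event -> next state).
    private final Map<AttemptState, Map<AttemptEvent, AttemptState>> table =
        new EnumMap<>(AttemptState.class);
    private AttemptState current = AttemptState.LAUNCHED;

    AttemptStateMachine(boolean ignoreLateEventsAtFailed) {
        addTransition(AttemptState.LAUNCHED, AttemptEvent.REGISTERED, AttemptState.RUNNING);
        addTransition(AttemptState.LAUNCHED, AttemptEvent.LAUNCH_FAILED, AttemptState.FAILED);
        addTransition(AttemptState.LAUNCHED, AttemptEvent.FAIL, AttemptState.FAILED);
        addTransition(AttemptState.RUNNING, AttemptEvent.FAIL, AttemptState.FAILED);
        if (ignoreLateEventsAtFailed) {
            // Assumed mitigation: events that may race with the failure become
            // self-loops on FAILED instead of being rejected.
            for (AttemptEvent late : EnumSet.of(AttemptEvent.REGISTERED, AttemptEvent.LAUNCH_FAILED)) {
                addTransition(AttemptState.FAILED, late, AttemptState.FAILED);
            }
        }
    }

    private void addTransition(AttemptState from, AttemptEvent on, AttemptState to) {
        table.computeIfAbsent(from, s -> new EnumMap<>(AttemptEvent.class)).put(on, to);
    }

    // Applies the event, or throws if the table has no entry for it.
    AttemptState handle(AttemptEvent event) {
        Map<AttemptEvent, AttemptState> row = table.get(current);
        AttemptState next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new InvalidTransitionException(current, event);
        }
        current = next;
        return current;
    }
}

class LateEventDemo {
    public static void main(String[] args) {
        // Without the self-loops, a late REGISTERED is rejected.
        AttemptStateMachine strict = new AttemptStateMachine(false);
        strict.handle(AttemptEvent.FAIL);
        try {
            strict.handle(AttemptEvent.REGISTERED);
        } catch (InvalidTransitionException e) {
            System.out.println(e.getMessage());
        }

        // With the self-loops, the same late event leaves the attempt in FAILED.
        AttemptStateMachine tolerant = new AttemptStateMachine(true);
        tolerant.handle(AttemptEvent.FAIL);
        System.out.println(tolerant.handle(AttemptEvent.REGISTERED));
    }
}
```

Treating late events as self-loops keeps the terminal state stable without hiding genuinely unexpected events, because anything not explicitly listed for FAILED still raises the exception.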
[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757913#comment-16757913 ]

lujie commented on YARN-9194:
-----------------------------

Hi [~leftnoteasy] and [~sunilg],

Yes, I have found the error and provided a patch in YARN-9248. Since you have created the issue, I will move the patch to YARN-9262. Thanks for your report.
[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757905#comment-16757905 ]

Sunil Govindan commented on YARN-9194:
--------------------------------------

hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions is failing after this patch. Created YARN-9262.
[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748109#comment-16748109 ]

Wangda Tan commented on YARN-9194:
----------------------------------

Cherry-picked to branch-3.1.2 as well.
[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745617#comment-16745617 ]

Hudson commented on YARN-9194:
------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #15785 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15785/])
YARN-9194. Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and (wangda: rev 6d7eedfd28cc1712690db2f6ca8a281b0901ee28)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM
    [ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744630#comment-16744630 ]

lujie commented on YARN-9194:
-----------------------------

Hi [~leftnoteasy] and [~wilfreds],

The UT failure does not seem to be related to this patch; the UT passes locally. Could you please review the latest patch, which adds the fix for YARN-9201? I have attached the RM log to that issue.