[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925358#comment-16925358
 ] 

Rohith Sharma K S commented on YARN-9820:
-

+1 lgtm as well.

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Assignee: Prabhu Joseph
> Priority: Critical
> Attachments: YARN-9820-001.patch, YARN-9820-002.patch, YARN-9820-003.patch
>
>
> It is observed that the RM logs an InvalidStateTransitionException. Not sure
> what the impact is, but it's better to handle it.
> {noformat}
> 2019-09-08 12:40:46,327 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the launch time for applicationId: application_1567926390667_0001, attemptId: appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: APP_UPDATE_SAVED at ACCEPTED
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
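
The failure mode in the log above can be reproduced in isolation. The following is a minimal sketch, not the YARN-9820 patch and not Hadoop's StateMachineFactory: the class TransitionSketch, its InvalidTransition exception, and the reduced state and event sets are all hypothetical names chosen for illustration. It shows that an event with no registered arc for the current state is rejected, and that registering a self-transition for APP_UPDATE_SAVED at ACCEPTED is one way to make the same event acceptable. The attached patches may well take a different approach.

```java
import java.util.EnumMap;
import java.util.Map;

/** Toy table-driven state machine; all names here are illustrative assumptions. */
class TransitionSketch {
  enum State { NEW_SAVING, ACCEPTED, RUNNING }
  enum Event { APP_UPDATE_SAVED, ATTEMPT_REGISTERED }

  /** Hypothetical stand-in for the exception seen in the RM log. */
  static class InvalidTransition extends RuntimeException {
    InvalidTransition(State state, Event event) {
      super("Invalid event: " + event + " at " + state);
    }
  }

  private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
  private State current;

  TransitionSketch(State initial) {
    this.current = initial;
  }

  /** Registers an arc: in state 'from', event 'on' moves the machine to 'to'. */
  TransitionSketch addTransition(State from, State to, Event on) {
    table.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class)).put(on, to);
    return this;
  }

  /** Applies the event, or throws if no arc is registered for the current state. */
  State handle(Event event) {
    Map<Event, State> arcs = table.get(current);
    State next = (arcs == null) ? null : arcs.get(event);
    if (next == null) {
      throw new InvalidTransition(current, event);
    }
    current = next;
    return current;
  }

  public static void main(String[] args) {
    // No arc registered for APP_UPDATE_SAVED at ACCEPTED: the event is rejected.
    TransitionSketch bare = new TransitionSketch(State.ACCEPTED);
    try {
      bare.handle(Event.APP_UPDATE_SAVED);
    } catch (InvalidTransition e) {
      System.out.println(e.getMessage());
    }

    // With a self-transition registered, the same event leaves the state unchanged.
    TransitionSketch tolerant = new TransitionSketch(State.ACCEPTED)
        .addTransition(State.ACCEPTED, State.ACCEPTED, Event.APP_UPDATE_SAVED);
    System.out.println(tolerant.handle(Event.APP_UPDATE_SAVED));
  }
}
```

Whether a self-transition or a change in when the event is sent is the better fix depends on what APP_UPDATE_SAVED is supposed to signal at that point, which the log alone does not settle.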



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925354#comment-16925354
 ] 

Hadoop QA commented on YARN-9821:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 33s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 5s | trunk passed |
| +1 | compile | 0m 20s | trunk passed |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 24s | trunk passed |
| +1 | shadedclient | 12m 3s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 29s | trunk passed |
| +1 | javadoc | 0m 19s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 16s | the patch passed |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 47s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 34s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 25s | hadoop-yarn-server-timelineservice-hbase-client in the patch passed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 49m 46s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9821 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979810/YARN-9821-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5a91a9ee6a79 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3b9584d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24773/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbas

[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925351#comment-16925351
 ] 

Jonathan Hung commented on YARN-9820:
-

Thanks [~Prabhu Joseph]. +1 pending Jenkins.




[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925346#comment-16925346
 ] 

Prabhu Joseph commented on YARN-9820:
-

Thanks [~jhung] and [~rohithsharma] for the detailed review. I have used this
approach in [^YARN-9820-003.patch].




[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9820:

Attachment: YARN-9820-003.patch




[jira] [Commented] (YARN-9816) EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925341#comment-16925341
 ] 

Prabhu Joseph commented on YARN-9816:
-

[~abmodi] Can you review this Jira when you get time? The patch ignores
unexpected files in the /ats/active directory, which otherwise cause the
EntityLogScanner thread to crash with a StackOverflowError. Thanks.
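
The idea of ignoring unexpected files during a recursive scan can be sketched as follows. This is an illustration under stated assumptions, not the YARN-9816 patch: ActiveDirScanSketch, Node, list and looksLikeAppDir are hypothetical names, the tree is held in memory rather than read from HDFS, and the sketch assumes that listing a plain file returns the file itself, which would make an unguarded recursive scan call itself forever on the same entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** In-memory sketch of a recursive active-directory scan; all names are illustrative. */
class ActiveDirScanSketch {
  /** Hypothetical stand-in for a file status entry. */
  static final class Node {
    final String name;
    final boolean directory;
    final List<Node> children = new ArrayList<>();

    Node(String name, boolean directory) {
      this.name = name;
      this.directory = directory;
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  /** Assumed listing semantics: listing a plain file yields that file itself. */
  static List<Node> list(Node node) {
    return node.directory ? node.children : Collections.singletonList(node);
  }

  /** Simplified check for an application directory (assumption for this sketch). */
  static boolean looksLikeAppDir(Node node) {
    return node.directory && node.name.startsWith("application_");
  }

  /** Counts application directories, descending only into directories. */
  static int scan(Node dir) {
    int found = 0;
    for (Node entry : list(dir)) {
      if (looksLikeAppDir(entry)) {
        found++;
      } else if (entry.directory) {
        found += scan(entry);
      }
      // Otherwise the entry is an unexpected plain file and is skipped.
      // Recursing into it would list the file, get the same entry back,
      // and recurse again without end.
    }
    return found;
  }

  public static void main(String[] args) {
    Node root = new Node("active", true)
        .add(new Node(".distcp.tmp.attempt", false))
        .add(new Node("cluster_1", true)
            .add(new Node("application_1_0001", true))
            .add(new Node("application_1_0002", true)));
    System.out.println("application directories found: " + scan(root));
  }
}
```

The guard is the `else if (entry.directory)` branch: without it, the stray file under the scanned directory is the only input needed to trigger unbounded recursion in this model.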

> EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError
> ---
>
> Key: YARN-9816
> URL: https://issues.apache.org/jira/browse/YARN-9816
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Affects Versions: 3.1.0, 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9816-001.patch
>
>
> EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError.  
> This happens when a file is present under /ats/active.
> {code}
> [hdfs@node2 yarn]$ hadoop fs -ls /ats/active
> Found 1 items
> -rw-r--r--   3 hdfs hadoop  0 2019-09-06 16:34 /ats/active/.distcp.tmp.attempt_155759136_39768_m_01_0
> {code}
> Error Message:
> {code:java}
> java.lang.StackOverflowError
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:632)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
>     at com.sun.proxy.$Proxy15.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2143)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1076)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1088)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1038)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1034)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1046)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.list(EntityGroupFSTimelineStore.java:398)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:368)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>  {code}

[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925332#comment-16925332
 ] 

Prabhu Joseph commented on YARN-9821:
-

Thanks [~rohithsharma] and [~abmodi] for reviewing.

I have addressed the review comments in [^YARN-9821-002.patch].

> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9821-001.patch, YARN-9821-002.patch
>
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting for monitor entry [0x7f5f1f29b000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>
>
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
>   at org.apache.hadoop.hbase.

[jira] [Updated] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9821:

Attachment: YARN-9821-002.patch

> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9821-001.patch, YARN-9821-002.patch
>
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting for monitor entry [0x7f5f1f29b000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>
>
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
>   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForFirstSuccessfullyCompletedTask(ResultBoundedCompletionService.java:214)
>   at org.

[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Abhishek Modi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925331#comment-16925331
 ] 

Abhishek Modi commented on YARN-9821:
-

Thanks [~Prabhu Joseph] for the patch. Some minor comments:
 # Can we rename isHbaseUp => isStorageUp to make it more generic?
 # Can we log the exception too?

Apart from these minor comments, it looks good to me.
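
The two review points can be illustrated with a small sketch. Only the name isStorageUp comes from the comment above; everything else (StorageMonitorSketch, StorageProbe, runCheck, and the guarded stop method) is a hypothetical construction and is not taken from the YARN-9821 patch. It shows a backend-agnostic health flag, a health check that passes the caught exception to the logger so its stack trace is recorded, and one plausible use of the flag: skipping a close that could block while storage is down.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of a backend-agnostic storage health flag; names other than isStorageUp are assumptions. */
class StorageMonitorSketch {
  private static final Logger LOG =
      Logger.getLogger(StorageMonitorSketch.class.getName());

  /** Hypothetical probe; a real one would issue a cheap request to the backend. */
  interface StorageProbe {
    void check() throws Exception;
  }

  private final StorageProbe probe;
  // Generic name: the flag does not say which backend sits behind the writer.
  private volatile boolean isStorageUp = true;

  StorageMonitorSketch(StorageProbe probe) {
    this.probe = probe;
  }

  boolean isStorageUp() {
    return isStorageUp;
  }

  /** Runs one health check and updates the flag. */
  void runCheck() {
    try {
      probe.check();
      isStorageUp = true;
    } catch (Exception e) {
      isStorageUp = false;
      // Passing the exception itself keeps the cause and stack trace in the log.
      LOG.log(Level.WARNING, "Timeline storage check failed", e);
    }
  }

  /** Runs the close action only if storage is up; returns whether it ran. */
  boolean stop(Runnable closeWriter) {
    if (!isStorageUp) {
      LOG.warning("Storage is down, skipping writer close");
      return false;
    }
    closeWriter.run();
    return true;
  }

  public static void main(String[] args) {
    StorageMonitorSketch monitor = new StorageMonitorSketch(() -> {
      throw new Exception("backend unreachable");
    });
    monitor.runCheck();
    System.out.println("close ran: " + monitor.stop(() -> System.out.println("closing writer")));
  }
}
```

Skipping the close trades a clean flush for a prompt shutdown, so buffered writes may be lost in this sketch; whether that is acceptable is a design decision for the actual patch.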

> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9821-001.patch
>
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting 
> for monitor entry [0x7f5f1f29b000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>   
>   
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 
> nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a 
> [Lorg.apache.hadoop.hbase.client.Resul

[jira] [Updated] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-09-08 Thread Anuhan Torgonshar (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuhan Torgonshar updated YARN-9349:

Flags:   (was: Important)

> When doTransition() method occurs exception, the log level practices are 
> inconsistent
> -
>
> Key: YARN-9349
> URL: https://issues.apache.org/jira/browse/YARN-9349
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0, 2.8.5
>Reporter: Anuhan Torgonshar
>Priority: Major
>  Labels: easyfix
> Fix For: 3.3.0
>
> Attachments: YARN-9349.trunk.patch
>
>
> The log level is *inconsistent* across the places where code catches 
> *_InvalidStateTransitionException_* thrown by the _*doTransition()*_ method.
> {code:java}
> // WARN level
> /**
>   file path: hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\application\ApplicationImpl.java
>   log statement line number: 482
>   log level: warn
> **/
> try {
>   // queue event requesting init of the same app
>   newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\localizer\LocalizedResource.java
>   log statement line number: 200
>   log level: warn
> **/
> try {
>   newState = this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\container\ContainerImpl.java
>   log statement line number: 1156
>   log level: warn
> **/
> try {
>   newState =
>       stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   LOG.warn("Can't handle this event at current state: Current: ["
>       + oldState + "], eventType: [" + event.getType() + "]", e);
> }
> // ERROR level
> /**
>   file path: hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\RMAppAttemptImpl.java
>   log statement line number: 878
>   log level: error
> **/
> try {
>   /* keep the master in sync with the state machine */
>   this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   LOG.error("App attempt: " + appAttemptID
>       + " can't handle this event at current state", e);
>   onInvalidTranstion(event.getType(), oldState);
> }
> /**
>   file path: hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmnode\RMNodeImpl.java
>   log statement line number: 623
>   log level: error
> **/
> try {
>   stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   LOG.error("Can't handle this event at current state", e);
>   LOG.error("Invalid event " + event.getType() +
>       " on Node " + this.nodeId);
> }
>
> // There are 8 similar code snippets with ERROR log level.
> {code}
> After looking through the whole project, I found 8 similar code snippets 
> that use the ERROR level when doTransition() throws 
> *InvalidStateTransitionException*, and just 3 places that choose the WARN 
> level in the same situation. Therefore, I think these 3 log statements 
> should be assigned the ERROR level to stay consistent with the other snippets.
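
One way to keep the level consistent, sketched here under assumptions rather than taken from the attached patch, is to route every catch site through a single helper so the level is chosen in one place. `TransitionLogging`, `InvalidTransitionException`, and `logInvalidTransition` are hypothetical names, and `java.util.logging` (where `SEVERE` plays the role of ERROR) stands in for the logger Hadoop actually uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for InvalidStateTransitionException. */
class InvalidTransitionException extends RuntimeException {
  InvalidTransitionException(String state, String event) {
    super("Invalid event: " + event + " at " + state);
  }
}

/** Hypothetical helper: every catch site logs through here. */
final class TransitionLogging {
  /** The single level used for every invalid transition. */
  static final Level INVALID_TRANSITION_LEVEL = Level.SEVERE;

  private TransitionLogging() {
  }

  /** Logs the failed transition at the shared level and returns the message. */
  static String logInvalidTransition(Logger log, String state, String event,
      InvalidTransitionException e) {
    String msg = "Can't handle this event at current state: Current: ["
        + state + "], eventType: [" + event + "]";
    log.log(INVALID_TRANSITION_LEVEL, msg, e);
    return msg;
  }
}
```

Changing the agreed level then means editing one constant instead of eleven catch blocks.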



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925326#comment-16925326
 ] 

Rohith Sharma K S commented on YARN-9820:
-

I agree with [~jhung]'s approach. We should pass a notifyApp flag so that 
RMStateStore can decide whether to trigger an event. 



> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Prabhu Joseph
>Priority: Critical
> Attachments: YARN-9820-001.patch, YARN-9820-002.patch
>
>
> It is observed that the RM logs an InvalidStateTransitionException. Not sure 
> what the impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Commented] (YARN-9612) Support using ip to register NodeID

2019-09-08 Thread Zhankun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925321#comment-16925321
 ] 

Zhankun Tang commented on YARN-9612:


[~cane], the background and the motivation are still not clear to me. :)

> Support using ip to register NodeID
> ---
>
> Key: YARN-9612
> URL: https://issues.apache.org/jira/browse/YARN-9612
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Priority: Major
>
> In environments like k8s, we should support using the IP when registering 
> the NodeID with the RM, since the hostname will be the podName, which cannot 
> be resolved by the k8s DNS.






[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-09-08 Thread Zhankun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925317#comment-16925317
 ] 

Zhankun Tang commented on YARN-9605:


[~cane], thanks for contributing this. I saw there are failures in the Jenkins 
results. Could you please try to fix them?

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch
>
>
> In this issue, I will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA.






[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925318#comment-16925318
 ] 

Rohith Sharma K S commented on YARN-9821:
-

The patch looks reasonable to me. +1. 


[jira] [Commented] (YARN-9739) appsTableData in AppsBlock may cause OOM

2019-09-08 Thread Zhankun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925308#comment-16925308
 ] 

Zhankun Tang commented on YARN-9739:


[~cane], thanks for catching this point. Do you mean we should make this a 
cache to serve multiple users' requests?

> appsTableData in AppsBlock may cause OOM
> 
>
> Key: YARN-9739
> URL: https://issues.apache.org/jira/browse/YARN-9739
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: zhoukang
>Priority: Major
> Attachments: heap0.png, heap1.png, stack.png
>
>
> If many users list the applications, it may cause an RM OOM.






[jira] [Commented] (YARN-9764) Print application submission context label in application summary

2019-09-08 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925301#comment-16925301
 ] 

Hudson commented on YARN-9764:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17255 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17255/])
YARN-9764. Print application submission context label in application (jhung: 
rev 43e389b9801e09741fdf78fef067b8ac60f691c8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


> Print application submission context label in application summary
> -
>
> Key: YARN-9764
> URL: https://issues.apache.org/jira/browse/YARN-9764
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Manoj Kumar
>Priority: Major
>  Labels: release-blocker
> Attachments: YARN-9764.01.patch, YARN-9764.02.patch, 
> YARN-9764.branch-2.01.patch, YARN-9764.branch-2.02.patch
>
>







[jira] [Updated] (YARN-9764) Print application submission context label in application summary

2019-09-08 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9764:

Attachment: YARN-9764.branch-2.02.patch








[jira] [Comment Edited] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925290#comment-16925290
 ] 

Jonathan Hung edited comment on YARN-9820 at 9/9/19 1:46 AM:
-

Thanks for catching this.

Perhaps we can implement it a different way. We can add a new 
{noformat}
public RMStateUpdateAppEvent(ApplicationStateData appState, boolean notifyApplication) {
{noformat}
constructor to RMStateUpdateAppEvent and a new method 
{noformat}
public void updateApplicationState(ApplicationStateData appState, boolean notifyApp) {
{noformat}
to RMStateStore, then call this new method in 
RMAppImpl#AttemptLaunchedTransition instead of 
updateApplicationState(ApplicationStateData). Previously we sent an event on 
every app launch; with this approach we can avoid sending these unnecessary 
events only to ignore them later.

Thoughts?









[jira] [Comment Edited] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925290#comment-16925290
 ] 

Jonathan Hung edited comment on YARN-9820 at 9/9/19 1:46 AM:
-

Thanks for catching this.

Perhaps we can implement it a different way. We can add a new 
{noformat}
public RMStateUpdateAppEvent(ApplicationStateData appState, boolean notifyApplication) {
{noformat}
constructor to RMStateUpdateAppEvent and a new method 
{noformat}
public void updateApplicationState(ApplicationStateData appState, boolean notifyApp) {
{noformat}
to RMStateStore, then call this new method with notifyApp = false in 
RMAppImpl#AttemptLaunchedTransition instead of 
updateApplicationState(ApplicationStateData). Previously we sent an event on 
every app launch; with this approach we can avoid sending these unnecessary 
events only to ignore them later.

Thoughts?
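
To illustrate the idea, a self-contained sketch follows. It is not the Hadoop code: `StateStoreSketch`, `AppStateSketch`, and the string-based event handler are simplified, hypothetical stand-ins, and only the `notifyApp` flag and the `updateApplicationState` overload mirror the proposal above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for ApplicationStateData. */
class AppStateSketch {
  final String appId;
  final long launchTime;

  AppStateSketch(String appId, long launchTime) {
    this.appId = appId;
    this.launchTime = launchTime;
  }
}

/** Hypothetical stand-in for RMStateStore. */
class StateStoreSketch {
  /** States that were persisted, kept in memory for illustration. */
  final List<AppStateSketch> saved = new ArrayList<>();
  private final Consumer<String> appEventHandler;

  StateStoreSketch(Consumer<String> appEventHandler) {
    this.appEventHandler = appEventHandler;
  }

  /** Existing behaviour: always notify the application after saving. */
  void updateApplicationState(AppStateSketch appState) {
    updateApplicationState(appState, true);
  }

  /** Proposed overload: the caller decides whether a follow-up event is sent. */
  void updateApplicationState(AppStateSketch appState, boolean notifyApp) {
    saved.add(appState);
    if (notifyApp) {
      appEventHandler.accept("APP_UPDATE_SAVED:" + appState.appId);
    }
  }
}
```

In this sketch, the launch-time update would call `updateApplicationState(state, false)`, so the state is still saved but no `APP_UPDATE_SAVED` event is dispatched to an app that cannot handle it.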









[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925290#comment-16925290
 ] 

Jonathan Hung commented on YARN-9820:
-

Thanks for catching this.

Perhaps we can implement it a different way. We can add a new 
{noformat}
public RMStateUpdateAppEvent(ApplicationStateData appState, boolean notifyApplication) {
{noformat}
constructor to RMStateUpdateAppEvent and a new method 
{noformat}
public void updateApplicationState(ApplicationStateData appState, boolean notifyApp) {
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925237#comment-16925237
 ] 

Hadoop QA commented on YARN-9820:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 42s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 86m 40s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9820 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979781/YARN-9820-002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 633134f7739a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca32917 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24771/testReport/ |
| Max. process+thread count | 792 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24771/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-9816) EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError

2019-09-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925236#comment-16925236
 ] 

Hadoop QA commented on YARN-9816:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 13s{color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9816 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979784/YARN-9816-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f1ec3d3a7caa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca32917 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24772/testReport/ |
| Max. process+thread count | 413 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24772/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (YARN-9816) EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9816:

Attachment: YARN-9816-001.patch

> EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError
> ---
>
> Key: YARN-9816
> URL: https://issues.apache.org/jira/browse/YARN-9816
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Affects Versions: 3.1.0, 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9816-001.patch
>
>
> EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError.  
> This happens when a file is present under /ats/active.
> {code}
> [hdfs@node2 yarn]$ hadoop fs -ls /ats/active
> Found 1 items
> -rw-r--r--   3 hdfs hadoop  0 2019-09-06 16:34 /ats/active/.distcp.tmp.attempt_155759136_39768_m_01_0
> {code}
> Error Message:
> {code:java}
> java.lang.StackOverflowError
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:632)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
>     at com.sun.proxy.$Proxy15.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2143)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1076)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1088)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1038)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1034)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1046)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.list(EntityGroupFSTimelineStore.java:398)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:368)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>     at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
>  {code}
> One of our users tried to distcp the hdfs://ats/active dir. The distcp job 
> created the temp file .distcp.tmp.attempt_155759136_39768_m_01_0 and failed 
> to delete it at the end, which has caused the

[jira] [Updated] (YARN-9816) EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9816:

Affects Version/s: 3.1.0
   3.2.0


[jira] [Updated] (YARN-9816) EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9816:

Description: 
EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError.  This 
happens when a file is present under /ats/active.

{code}
[hdfs@node2 yarn]$ hadoop fs -ls /ats/active
Found 1 items
-rw-r--r--   3 hdfs hadoop  0 2019-09-06 16:34 /ats/active/.distcp.tmp.attempt_155759136_39768_m_01_0
{code}

Error Message:
{code:java}
java.lang.StackOverflowError
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:632)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy15.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2143)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1076)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1088)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1038)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1034)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1046)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.list(EntityGroupFSTimelineStore.java:398)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:368)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.scanActiveLogs(EntityGroupFSTimelineStore.java:383)
 {code}

One of our users tried to distcp the hdfs://ats/active dir. The distcp job 
created the temp file .distcp.tmp.attempt_155759136_39768_m_01_0 and failed 
to delete it at the end, which caused the EntityLogScanner thread to crash 
with a StackOverflowError.
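
To illustrate why a single stray file can produce the repeated scanActiveLogs frames in the trace, here is a minimal sketch. It assumes that listing a plain file returns that file itself, which is how the toy file system here behaves. ToyFs and ActiveLogScanner are hypothetical names, not the Hadoop API, and the guard shown is one possible approach, not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory tree; hypothetical, not the Hadoop FileSystem API. */
class ToyFs {
  private final Map<String, List<String>> dirs = new HashMap<>();

  void mkdir(String dir) {
    dirs.putIfAbsent(dir, new ArrayList<>());
  }

  void addChild(String dir, String child, boolean isDir) {
    mkdir(dir);
    dirs.get(dir).add(child);
    if (isDir) {
      mkdir(child);
    }
  }

  boolean isDirectory(String path) {
    return dirs.containsKey(path);
  }

  // Assumed semantics: listing a plain file yields that file itself.
  List<String> list(String path) {
    return isDirectory(path) ? dirs.get(path) : Collections.singletonList(path);
  }
}

class ActiveLogScanner {
  static boolean looksLikeAppDir(String path) {
    return path.substring(path.lastIndexOf('/') + 1).startsWith("application_");
  }

  // Unguarded: a stray file lists as itself, so the recursion never bottoms out.
  static int scanUnguarded(ToyFs fs, String dir) {
    int count = 0;
    for (String child : fs.list(dir)) {
      count += looksLikeAppDir(child) ? 1 : scanUnguarded(fs, child);
    }
    return count;
  }

  // Guarded: only descend into entries that are directories.
  static int scanGuarded(ToyFs fs, String dir) {
    int count = 0;
    for (String child : fs.list(dir)) {
      if (looksLikeAppDir(child)) {
        count++;
      } else if (fs.isDirectory(child)) {
        count += scanGuarded(fs, child);
      }
    }
    return count;
  }
}
```

In this toy model the guarded scan simply skips the leftover temp file and keeps counting application directories.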

  was:
EntityGroupFSTimelineStore#scanActiveLogs fails with StackOverflowError.  This 
happens when an Invalid applicationDir is present in /ats/active.

{code}
[hdfs@node2 yarn]$ hadoop fs -ls /ats/active
Found 1 items
-rw-r--r--   3 hdfs hadoop  0 2019-09-06 16:34 /ats/active/.distcp.tmp.attempt_155759136_39768_m_01_0
{code}

Error Message:
{code:java}
java.lang.StackOverflowError
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:632)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMetho

[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925219#comment-16925219
 ] 

Prabhu Joseph commented on YARN-9821:
-

[~abmodi] Can you review this Jira when you get time? The patch fixes the 
NodeManager getting blocked at serviceStop when the ATSv2 backend HBase is down.
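
As a generic illustration of the failure mode, a shutdown thread blocked indefinitely inside a close() call, here is a minimal sketch of bounding a close with a timeout. BoundedClose and closeWithTimeout are hypothetical names; this is an assumption-laden example, not the approach taken in the attached patch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; not part of Hadoop or HBase. */
class BoundedClose {
  /** Runs closeAction on a daemon thread; returns false if it did not finish in time. */
  static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "bounded-close");
      t.setDaemon(true); // do not keep the JVM alive if close never returns
      return t;
    });
    Future<?> future = executor.submit(closeAction);
    try {
      future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // Interrupts the worker; a thread blocked on a monitor will not react,
      // which is why the worker is a daemon thread.
      future.cancel(true);
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      return false; // the close action itself failed
    } finally {
      executor.shutdownNow();
    }
  }
}
```

The caller gives up waiting after the timeout, so the stop sequence can continue even if the underlying close stays stuck.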



> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9821-001.patch
>
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting for monitor entry [0x7f5f1f29b000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>   
>   
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
>   at org.apache.h

[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9820:

Attachment: YARN-9820-002.patch




[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925210#comment-16925210
 ] 

Hadoop QA commented on YARN-9820:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 102 unchanged - 0 fixed = 103 total (was 102) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 84m 
59s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9820 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979776/YARN-9820-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e0457c1a6d59 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca32917 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24769/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24769/testReport/ |
| Max. process+thread count | 813 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-

[jira] [Commented] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925207#comment-16925207
 ] 

Hadoop QA commented on YARN-9821:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  0s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-client in 
the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9821 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979778/YARN-9821-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 64a92ab10bad 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca32917 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24770/testReport/ |
| Max. process+thread count | 307 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase

[jira] [Updated] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9821:

Attachment: YARN-9821-001.patch

> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9821-001.patch
>
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting 
> for monitor entry [0x7f5f1f29b000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>   
>   
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 
> nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a 
> [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForFirstSuccessfullyCompletedTask(ResultBoundedCompletionService.java:214)
>   at 
> org.apache.hadoop.hbase.c

[jira] [Updated] (YARN-9822) TimelineCollectorWebService#putEntities blocked when ATSV2 HBase is down.

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9822:

Parent: YARN-9802
Issue Type: Sub-task  (was: Bug)

> TimelineCollectorWebService#putEntities blocked when ATSV2 HBase is down.
> -
>
> Key: YARN-9822
> URL: https://issues.apache.org/jira/browse/YARN-9822
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
>
> TimelineCollectorWebService#putEntities is blocked when the ATSv2 HBase
> backend is down. YARN-9374 prevents threads from getting blocked once it has
> already identified that HBase is down, before accessing HBase.
> TimelineCollector can check whether the writer backend is up or down before
> locking the writer.
> {code}
> synchronized (writer) {
>   response = writeTimelineEntities(entities, callerUgi);
>   flushBufferedTimelineEntities();
> }
> {code}
> {code}
> "qtp183259297-80" #80 daemon prio=5 os_prio=0 tid=0x7f5f567fd000 
> nid=0x5fbb waiting for monitor entry [0x7f5f236d4000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollector.putEntities(TimelineCollector.java:164)
>   - waiting to lock <0x0006c7c05770> (a 
> org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorWebService.putEntities(TimelineCollectorWebService.java:186)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>   at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>   at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>   at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
>   at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1624)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:175
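
The idea in the YARN-9822 description above (check whether the writer backend is up before taking the writer lock) can be illustrated with a minimal sketch. This is not the Hadoop code or the attached patch: `GuardedWriterSketch`, its nested `Writer`, `isBackendUp`, and `putEntity` are hypothetical names used only for illustration, and the health flag stands in for whatever health check the real writer exposes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedWriterSketch {

  /** Hypothetical stand-in for a timeline writer whose backend may be down. */
  public static class Writer {
    private final AtomicBoolean backendUp = new AtomicBoolean(true);
    private int written = 0;

    public boolean isBackendUp() {
      return backendUp.get();
    }

    public void setBackendUp(boolean up) {
      backendUp.set(up);
    }

    // Called only while the caller holds the writer's monitor.
    void write(String entity) {
      written++;
    }

    public synchronized int writtenCount() {
      return written;
    }
  }

  private final Writer writer;

  public GuardedWriterSketch(Writer writer) {
    this.writer = writer;
  }

  /**
   * Returns false without touching the lock when the backend is known to be
   * down, so callers do not queue up behind a writer that cannot progress.
   */
  public boolean putEntity(String entity) {
    // Health check happens BEFORE contending for the monitor (assumption:
    // the check itself is cheap and non-blocking).
    if (!writer.isBackendUp()) {
      return false;
    }
    synchronized (writer) {
      writer.write(entity);
    }
    return true;
  }

  public static void main(String[] args) {
    Writer w = new Writer();
    GuardedWriterSketch collector = new GuardedWriterSketch(w);
    System.out.println(collector.putEntity("entity-1"));
    w.setBackendUp(false);
    System.out.println(collector.putEntity("entity-2"));
  }
}
```

Note that a check-then-lock guard like this only narrows the window: the backend can still go down between the check and the write, so a thread already inside the synchronized block may still wait on the backend.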

[jira] [Updated] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9821:

Parent: YARN-9802
Issue Type: Sub-task  (was: Bug)

> NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
> -
>
> Key: YARN-9821
> URL: https://issues.apache.org/jira/browse/YARN-9821
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: ATSv2
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
>
> NM hangs at serviceStop when ATSV2 Backend Hbase is Down.
> {code}
> "Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting 
> for monitor entry [0x7f5f1f29b000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
>   - waiting to lock <0x0006c834d148> (a 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05808> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05890> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c058f8> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
>   - locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c059a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05a98> (a java.lang.Object)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
>   at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
>   - locked <0x0006c7c05c88> (a java.lang.Object)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)
>   
>   
> "qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 
> nid=0x5fb7 in Object.wait() [0x7f5f23ad7000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:460)
>   at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
>   - locked <0x000784ee8220> (a 
> [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForFirstSuccessfullyCompletedTask(ResultBoundedCompletionService.java:214)
>   at 
> org.apache.hadoop.hbase.client.ScannerCalla

[jira] [Assigned] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-9820:
---

Assignee: Prabhu Joseph

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Assignee: Prabhu Joseph
> Priority: Critical
> Attachments: YARN-9820-001.patch
>
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what
> the impact is, but it's better to handle it.
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925179#comment-16925179
 ] 

Prabhu Joseph commented on YARN-9820:
-

As per the YARN-9438 patch, it looks like we can ignore the {{APP_UPDATE_SAVED}}
event when the app is in the {{ACCEPTED}} state.

[~jhung] [~haibo.chen] Can you review this Jira when you get time? Thanks.
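
To make the suggestion above concrete, here is a small self-contained sketch of what "ignoring" an event in a given state means in a table-driven state machine: the event is registered as a self-transition, so it no longer raises an invalid-transition error. This is a toy model, not Hadoop's `StateMachineFactory` and not the attached patch; the class `AppStateMachineSketch`, the `ATTEMPT_REGISTERED` event, and the `RUNNING` target state are assumptions chosen for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

public class AppStateMachineSketch {

  public enum State { ACCEPTED, RUNNING }

  public enum Event { APP_UPDATE_SAVED, ATTEMPT_REGISTERED }

  // Transition table: current state -> (event -> next state).
  private final Map<State, Map<Event, State>> table =
      new EnumMap<State, Map<Event, State>>(State.class);

  private State current = State.ACCEPTED;

  public AppStateMachineSketch(boolean ignoreUpdateSavedAtAccepted) {
    for (State s : State.values()) {
      table.put(s, new EnumMap<Event, State>(Event.class));
    }
    table.get(State.ACCEPTED).put(Event.ATTEMPT_REGISTERED, State.RUNNING);
    if (ignoreUpdateSavedAtAccepted) {
      // Self-transition: the event is accepted but the state does not change.
      table.get(State.ACCEPTED).put(Event.APP_UPDATE_SAVED, State.ACCEPTED);
    }
  }

  /** Applies the event, or throws if no transition is registered for it. */
  public State handle(Event event) {
    State next = table.get(current).get(event);
    if (next == null) {
      throw new IllegalStateException(
          "Invalid event: " + event + " at " + current);
    }
    current = next;
    return current;
  }

  public static void main(String[] args) {
    AppStateMachineSketch patched = new AppStateMachineSketch(true);
    System.out.println(patched.handle(Event.APP_UPDATE_SAVED));

    AppStateMachineSketch unpatched = new AppStateMachineSketch(false);
    try {
      unpatched.handle(Event.APP_UPDATE_SAVED);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With the self-transition registered, the late `APP_UPDATE_SAVED` event is absorbed silently; without it, the toy machine fails in the same way the RM log in the issue description does.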

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Priority: Critical
> Attachments: YARN-9820-001.patch
>
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what
> the impact is, but it's better to handle it.
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9820:

Attachment: YARN-9820-001.patch

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Priority: Critical
> Attachments: YARN-9820-001.patch
>
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what
> the impact is, but it's better to handle it.
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Created] (YARN-9822) TimelineCollectorWebService#putEntities blocked when ATSV2 HBase is down.

2019-09-08 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9822:
---

 Summary: TimelineCollectorWebService#putEntities blocked when 
ATSV2 HBase is down.
 Key: YARN-9822
 URL: https://issues.apache.org/jira/browse/YARN-9822
 Project: Hadoop YARN
  Issue Type: Bug
  Components: ATSv2
Affects Versions: 3.2.0, 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


TimelineCollectorWebService#putEntities is blocked when the ATSv2 HBase backend 
is down. YARN-9374 prevents the threads from getting blocked when it has already 
identified that HBase is down before accessing HBase. TimelineCollector can check 
whether the writer backend is up or down before locking the writer.

{code}
synchronized (writer) {
  response = writeTimelineEntities(entities, callerUgi);
  flushBufferedTimelineEntities();
}
{code}
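
As a rough illustration of the idea (check the backend before taking the writer 
lock), here is a minimal self-contained sketch. It is not the actual patch: 
SketchWriter, SketchCollector, isBackendUp and putEntity are hypothetical names 
made up for this example and are not part of the TimelineCollector or 
TimelineWriter API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a timeline writer; not the real TimelineWriter. */
class SketchWriter {
  // Assumed to be maintained elsewhere, e.g. by a periodic health monitor.
  private final AtomicBoolean backendUp = new AtomicBoolean(true);
  private final List<String> stored = new ArrayList<>();

  boolean isBackendUp() {
    return backendUp.get();
  }

  void setBackendUp(boolean up) {
    backendUp.set(up);
  }

  synchronized void write(String entity) {
    stored.add(entity);
  }

  synchronized int storedCount() {
    return stored.size();
  }
}

/** Hypothetical collector that fails fast instead of queueing on the lock. */
class SketchCollector {
  private final SketchWriter writer;

  SketchCollector(SketchWriter writer) {
    this.writer = writer;
  }

  /** Returns true if the entity was written, false if the backend is down. */
  boolean putEntity(String entity) {
    // Lock-free health check first, so request threads do not pile up
    // behind a writer that is stuck on an unreachable backend.
    if (!writer.isBackendUp()) {
      return false;
    }
    synchronized (writer) {
      writer.write(entity);
    }
    return true;
  }
}

class GuardedWriterDemo {
  public static void main(String[] args) {
    SketchWriter writer = new SketchWriter();
    SketchCollector collector = new SketchCollector(writer);
    System.out.println(collector.putEntity("entity-1"));
    writer.setBackendUp(false);
    System.out.println(collector.putEntity("entity-2"));
    System.out.println(writer.storedCount());
  }
}
```

Note that the check is only best effort: the backend can still go down between 
the check and the lock, so this narrows the window rather than closing it.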


{code}
"qtp183259297-80" #80 daemon prio=5 os_prio=0 tid=0x7f5f567fd000 nid=0x5fbb 
waiting for monitor entry [0x7f5f236d4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollector.putEntities(TimelineCollector.java:164)
- waiting to lock <0x0006c7c05770> (a 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorWebService.putEntities(TimelineCollectorWebService.java:186)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at 
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
at 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1624)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)

[jira] [Created] (YARN-9821) NM hangs at serviceStop when ATSV2 Backend Hbase is Down

2019-09-08 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9821:
---

 Summary: NM hangs at serviceStop when ATSV2 Backend Hbase is Down 
 Key: YARN-9821
 URL: https://issues.apache.org/jira/browse/YARN-9821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: ATSv2
Affects Versions: 3.2.0, 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


NM hangs at serviceStop when ATSV2 Backend Hbase is Down.

{code}

"Thread-197" #302 prio=5 os_prio=0 tid=0x7f5f389ba000 nid=0x631d waiting 
for monitor entry [0x7f5f1f29b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:249)
- waiting to lock <0x0006c834d148> (a 
org.apache.hadoop.hbase.client.BufferedMutatorImpl)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.common.TypedBufferedMutator.close(TypedBufferedMutator.java:62)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceStop(HBaseTimelineWriterImpl.java:636)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c05808> (a java.lang.Object)
at 
org.apache.hadoop.service.AbstractService.close(AbstractService.java:247)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceStop(TimelineCollectorManager.java:244)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceStop(NodeTimelineCollectorManager.java:164)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c05890> (a java.lang.Object)
at 
org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceStop(PerNodeTimelineCollectorsAuxService.java:113)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c058f8> (a java.lang.Object)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStop(AuxServices.java:330)
- locked <0x0006c7c23400> (a java.util.Collections$SynchronizedMap)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c059a8> (a java.lang.Object)
at 
org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at 
org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at 
org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
at 
org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStop(ContainerManagerImpl.java:720)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c05a98> (a java.lang.Object)
at 
org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at 
org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at 
org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
at 
org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:526)
at 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
- locked <0x0006c7c05c88> (a java.lang.Object)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager$1.run(NodeManager.java:552)


"qtp183259297-76" #76 daemon prio=5 os_prio=0 tid=0x7f5f567ed000 nid=0x5fb7 
in Object.wait() [0x7f5f23ad7000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:460)
at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
at 
org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForSpecificCompletedTask(ResultBoundedCompletionService.java:258)
- locked <0x000784ee8220> (a 
[Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
at 
org.apache.hadoop.hbase.client.ResultBoundedCompletionService.pollForFirstSuccessfullyCompletedTask(ResultBoundedCompletionService.java:214)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:228)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
at 
org.apache.hadoop.hbase.client.ClientScanner.c

[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-9820:

Target Version/s:   (was: 3.2.2)

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Priority: Critical
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what the 
> impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-9820:

Affects Version/s: (was: 3.2.1)

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Priority: Critical
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what the 
> impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-9820:

Target Version/s: 3.2.2

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Rohith Sharma K S
>Priority: Critical
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what the 
> impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Commented] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925135#comment-16925135
 ] 

Rohith Sharma K S commented on YARN-9820:
-

YARN-9438 causes an update event to be triggered immediately after app submission. 
If this is an expected event, then it needs to be ignored in RMAppImpl. 
cc:/ [~jhung] [~haibochen]
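
To illustrate what "ignoring" the event could look like, below is a toy state 
machine in plain Java. It is only a sketch under assumptions: it does not use 
Hadoop's StateMachineFactory, the AppState and AppEvent enums are simplified 
stand-ins for the real RM application states and events, and 
AppStateMachineSketch is a hypothetical class. The point is that registering 
APP_UPDATE_SAVED as a self-transition at ACCEPTED makes the event a no-op 
instead of an invalid transition.

```java
import java.util.EnumMap;
import java.util.Map;

// Simplified, illustrative subsets; the real RM has more states and events.
enum AppState { NEW, ACCEPTED, RUNNING }

enum AppEvent { START, ATTEMPT_REGISTERED, APP_UPDATE_SAVED }

/** Hypothetical toy state machine; not the RMAppImpl implementation. */
class AppStateMachineSketch {
  private AppState state = AppState.NEW;
  private final Map<AppState, Map<AppEvent, AppState>> transitions =
      new EnumMap<AppState, Map<AppEvent, AppState>>(AppState.class);

  AppStateMachineSketch() {
    addTransition(AppState.NEW, AppEvent.START, AppState.ACCEPTED);
    addTransition(AppState.ACCEPTED, AppEvent.ATTEMPT_REGISTERED,
        AppState.RUNNING);
    // Self-transition: the event is accepted at ACCEPTED but changes nothing.
    addTransition(AppState.ACCEPTED, AppEvent.APP_UPDATE_SAVED,
        AppState.ACCEPTED);
  }

  private void addTransition(AppState from, AppEvent event, AppState to) {
    Map<AppEvent, AppState> byEvent = transitions.get(from);
    if (byEvent == null) {
      byEvent = new EnumMap<AppEvent, AppState>(AppEvent.class);
      transitions.put(from, byEvent);
    }
    byEvent.put(event, to);
  }

  /** Applies the event, or throws if no transition is registered for it. */
  AppState handle(AppEvent event) {
    Map<AppEvent, AppState> byEvent = transitions.get(state);
    AppState next = (byEvent == null) ? null : byEvent.get(event);
    if (next == null) {
      throw new IllegalStateException(
          "Invalid event: " + event + " at " + state);
    }
    state = next;
    return state;
  }
}
```

In this sketch RUNNING has no entry for APP_UPDATE_SAVED, so the same event 
still fails there; whether other states should ignore it as well is a separate 
decision.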

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Rohith Sharma K S
>Priority: Critical
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what the 
> impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Updated] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-9820:

Affects Version/s: 3.2.1

> RM logs InvalidStateTransitionException when app is submitted
> -
>
> Key: YARN-9820
> URL: https://issues.apache.org/jira/browse/YARN-9820
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Rohith Sharma K S
>Priority: Critical
>
> It is observed that RM logs InvalidStateTransitionException. Not sure what the 
> impact is, but it's better to handle it. 
> {noformat}
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED 
> on event = LAUNCHED
> 2019-09-08 12:40:46,327 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
> launch time for applicationId: application_1567926390667_0001, attemptId: 
> appattempt_1567926390667_0001_01launchTime: 1567926646327
> 2019-09-08 12:40:46,328 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
> info for app: application_1567926390667_0001
> 2019-09-08 12:40:46,332 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
> application_1567926390667_0001 can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> APP_UPDATE_SAVED at ACCEPTED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Created] (YARN-9820) RM logs InvalidStateTransitionException when app is submitted

2019-09-08 Thread Rohith Sharma K S (Jira)
Rohith Sharma K S created YARN-9820:
---

 Summary: RM logs InvalidStateTransitionException when app is 
submitted
 Key: YARN-9820
 URL: https://issues.apache.org/jira/browse/YARN-9820
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Rohith Sharma K S


It is observed that RM logs InvalidStateTransitionException. Not sure what the 
impact is, but it's better to handle it. 

{noformat}
2019-09-08 12:40:46,327 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1567926390667_0001_01 State change from ALLOCATED to LAUNCHED on 
event = LAUNCHED
2019-09-08 12:40:46,327 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: update the 
launch time for applicationId: application_1567926390667_0001, attemptId: 
appattempt_1567926390667_0001_01launchTime: 1567926646327
2019-09-08 12:40:46,328 INFO 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating 
info for app: application_1567926390667_0001
2019-09-08 12:40:46,332 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: App: 
application_1567926390667_0001 can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
APP_UPDATE_SAVED at ACCEPTED
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:881)
at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1030)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:1014)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
at java.lang.Thread.run(Thread.java:748)
{noformat}







[jira] [Commented] (YARN-9605) Add ZkConfiguredFailoverProxyProvider for RM HA

2019-09-08 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925103#comment-16925103
 ] 

zhoukang commented on YARN-9605:


[~Prabhu Joseph] [~tangzhankun] Could you please help review this patch? Thanks a lot.

> Add ZkConfiguredFailoverProxyProvider for RM HA
> ---
>
> Key: YARN-9605
> URL: https://issues.apache.org/jira/browse/YARN-9605
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-9605.001.patch
>
>
> In this issue, I will track a new feature to support 
> ZkConfiguredFailoverProxyProvider for RM HA.






[jira] [Commented] (YARN-9612) Support using ip to register NodeID

2019-09-08 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925102#comment-16925102
 ] 

zhoukang commented on YARN-9612:


IIUC, the solution is to add a service name for each pod? [~tangzhankun] I think 
that is not very elegant.

> Support using ip to register NodeID
> ---
>
> Key: YARN-9612
> URL: https://issues.apache.org/jira/browse/YARN-9612
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: zhoukang
>Priority: Major
>
> In environments like k8s, we should support using the IP when registering the 
> NodeID with the RM, since the hostname will be the podName, which cannot be 
> resolved by the k8s DNS.






[jira] [Commented] (YARN-9739) appsTableData in AppsBlock may cause OOM

2019-09-08 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925100#comment-16925100
 ] 

zhoukang commented on YARN-9739:


Any suggestions for this, [~tangzhankun]? Our current implementation just caches 
it, which I think is not elegant enough.

> appsTableData in AppsBlock may cause OOM
> 
>
> Key: YARN-9739
> URL: https://issues.apache.org/jira/browse/YARN-9739
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: zhoukang
>Priority: Major
> Attachments: heap0.png, heap1.png, stack.png
>
>
> If many users list the applications at once, it may cause an RM OOM.






[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-09-08 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925101#comment-16925101
 ] 

zhoukang commented on YARN-9537:


Yes, I will. [~yufeigu] Thanks a lot!

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9537.001.patch
>
>
> In this issue, I will add a configuration to support disabling AM preemption.






[jira] [Commented] (YARN-9748) Allow capacity-scheduler configuration on HDFS

2019-09-08 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925099#comment-16925099
 ] 

zhoukang commented on YARN-9748:


Sorry for the late reply, [~Prabhu Joseph]. I think the title is misleading.
What we want in our production cluster is an auto-reload feature. Maybe I should 
change the title?
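
To make the auto-reload idea concrete, the sketch below reloads a file whenever its modification time changes. It is only an illustration under stated assumptions: the class ReloadingConfig is hypothetical, and a local java.nio.file.Path stands in for a file on HDFS, so none of this reflects the actual capacity-scheduler code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/** Hypothetical sketch: reloads a configuration file when its mtime changes. */
final class ReloadingConfig {
  private final Path file;
  private FileTime lastLoaded;   // null until the first successful load
  private String contents;

  ReloadingConfig(Path file) {
    this.file = file;
  }

  /** Reloads the file if its modification time changed; returns true on reload. */
  synchronized boolean reloadIfChanged() throws IOException {
    FileTime modified = Files.getLastModifiedTime(file);
    if (modified.equals(lastLoaded)) {
      return false;
    }
    contents = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    lastLoaded = modified;
    return true;
  }

  synchronized String contents() {
    return contents;
  }
}
```

A scheduled task could call reloadIfChanged() at a fixed interval and refresh the scheduler only when it returns true.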

> Allow capacity-scheduler configuration on HDFS
> --
>
> Key: YARN-9748
> URL: https://issues.apache.org/jira/browse/YARN-9748
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.2
>Reporter: zhoukang
>Assignee: Prabhu Joseph
>Priority: Major
>
> Improvement:
> Support auto reload from hdfs






[jira] [Updated] (YARN-9748) Allow capacity-scheduler configuration on HDFS and support reload from hdfs

2019-09-08 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9748:
---
Summary: Allow capacity-scheduler configuration on HDFS and support reload 
from hdfs  (was: Allow capacity-scheduler configuration on HDFS)

> Allow capacity-scheduler configuration on HDFS and support reload from hdfs
> ---
>
> Key: YARN-9748
> URL: https://issues.apache.org/jira/browse/YARN-9748
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.2
>Reporter: zhoukang
>Assignee: Prabhu Joseph
>Priority: Major
>
> Improvement:
> Support auto reload from hdfs






[jira] [Updated] (YARN-9748) Allow capacity-scheduler configuration on HDFS and support reload from HDFS

2019-09-08 Thread zhoukang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated YARN-9748:
---
Summary: Allow capacity-scheduler configuration on HDFS and support reload 
from HDFS  (was: Allow capacity-scheduler configuration on HDFS and support 
reload from hdfs)

> Allow capacity-scheduler configuration on HDFS and support reload from HDFS
> ---
>
> Key: YARN-9748
> URL: https://issues.apache.org/jira/browse/YARN-9748
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.2
>Reporter: zhoukang
>Assignee: Prabhu Joseph
>Priority: Major
>
> Improvement:
> Support auto reload from hdfs






[jira] [Commented] (YARN-8199) Logging fileSize of log files under NM Local Dir

2019-09-08 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925094#comment-16925094
 ] 

Prabhu Joseph commented on YARN-8199:
-

[~rohithsharma] Below is the commit ID. We missed the Jira number in the commit 
message.

{code}
commit 54ac80176e8487b7a18cd9e16a11efa289d0b7df
Author: Szilard Nemeth 
Date:   Fri Aug 2 13:38:06 2019 +0200

    Logging fileSize of log files under NM Local Dir. Contributed by Prabhu Joseph
{code}

> Logging fileSize of log files under NM Local Dir
> 
>
> Key: YARN-8199
> URL: https://issues.apache.org/jira/browse/YARN-8199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: supportability
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: 0001-YARN-8199.patch, 0002-YARN-8199.patch, 
> YARN-8199-003.patch, YARN-8199-004.patch, YARN-8199-branch-3.1.001.patch, 
> YARN-8199-branch-3.2.001.patch
>
>
> Having the NodeManager log the fileSize of log files such as syslog, stderr, 
> and stdout under the NM Local Dir before the cleanup will help to find 
> applications that write overly verbose logs.
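
As a rough illustration of the idea, the sketch below collects the size of every regular file under a directory so the sizes can be logged before the directory is deleted. It is not the committed patch: the class LogSizeReporter and its method are hypothetical, and plain java.nio.file calls are used instead of the NodeManager's own file-system and logging APIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: reports file sizes under a log directory. */
final class LogSizeReporter {
  /** Returns the size in bytes of each regular file under dir, keyed by relative path. */
  static Map<String, Long> fileSizes(Path dir) throws IOException {
    List<Path> files;
    try (Stream<Path> paths = Files.walk(dir)) {
      files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
    }
    Map<String, Long> sizes = new TreeMap<>();
    for (Path p : files) {
      sizes.put(dir.relativize(p).toString(), Files.size(p));
    }
    return sizes;
  }
}
```

A caller would log each entry of the returned map just before removing the directory, so that an unusually large stdout or syslog can be traced back to its container.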






[jira] [Commented] (YARN-9787) Typo in analysesErrorMsg

2019-09-08 Thread kevin su (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925092#comment-16925092
 ] 

kevin su commented on YARN-9787:


Thanks to [~surendrasingh] and [~jojochuang] for the review and commit.

> Typo in analysesErrorMsg
> 
>
> Key: YARN-9787
> URL: https://issues.apache.org/jira/browse/YARN-9787
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: kevin su
>Priority: Trivial
>  Labels: newbie, noob
> Fix For: 3.3.0
>
> Attachments: YARN-9787.001.patch
>
>
> {code:java}
>   analysis.append("Please check whether your etc/hadoop/mapred-site.xml "
>   + "contains the below configuration:\n");
> {code}
> I think it should be {{/etc/hadoop/mapred-site.xml}}
> https://github.com/apache/hadoop/blob/2064ca015d1584263aac0cc20c60b925a3aff612/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L788-L789


