[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112600#comment-15112600 ] Cristian Barca commented on YARN-2559: -- when submitting Unmanaged Application Masters instead > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112596#comment-15112596 ] Cristian Barca commented on YARN-2559: -- Yes, saw it after a bit of digging. Correct, is supposed to be fixed by YARN 4452 -- though the title is a bit vague (NPE when submitting unmanaged Application). I cannot confirm the fix since the Hadoop version I am working on is 2.7.0. The workaround I have in mind to overcome this issue in older versions is to just disable the GHS property, yarn.resourcemanager.system-metrics-publisher.enabled=false. Any other ideas how to do it w/o affecting the whole system, i.e. something more specific to an AppMaster? Will confirm the fix on my next upgrade to 2.8.0. I've added related links to those 2 guys here as well. Thanks! > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112251#comment-15112251 ] Naganarasimha G R commented on YARN-2559: - Hi [~cristib], Can you please share the NPE stack trace ? We just fixed in YARN-4452 and YARN-4623 recently in trunk, 2.7.3 and 2.6.4. May be is it the same issue you are facing ? > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112193#comment-15112193 ] Cristian Barca commented on YARN-2559: -- This is still not fixed for unmanaged AppMasters. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138955#comment-14138955 ] Hudson commented on YARN-2559: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1875 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1875/]) YARN-2559. Fixed NPE in SystemMetricsPublisher when retrieving FinalApplicationStatus. Contributed by Zhijie Shen (jianhe: rev ee21b13cbd4654d7181306404174329f12193613) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138942#comment-14138942 ] Hudson commented on YARN-2559: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1900 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1900/]) YARN-2559. Fixed NPE in SystemMetricsPublisher when retrieving FinalApplicationStatus. Contributed by Zhijie Shen (jianhe: rev ee21b13cbd4654d7181306404174329f12193613) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * hadoop-yarn-project/CHANGES.txt > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138814#comment-14138814 ] Hudson commented on YARN-2559: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #684 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/684/]) YARN-2559. Fixed NPE in SystemMetricsPublisher when retrieving FinalApplicationStatus. Contributed by Zhijie Shen (jianhe: rev ee21b13cbd4654d7181306404174329f12193613) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Fix For: 2.6.0 > > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138401#comment-14138401 ] Hadoop QA commented on YARN-2559: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669586/YARN-2559.4.patch against trunk revision 47e5e19. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5016//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5016//console This message is automatically generated. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch, > YARN-2559.4.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138306#comment-14138306 ] Hadoop QA commented on YARN-2559: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669569/YARN-2559.3.patch against trunk revision 123f20d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5012//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5012//console This message is automatically generated. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch, YARN-2559.2.patch, YARN-2559.3.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138162#comment-14138162 ] Jian He commented on YARN-2559: --- looks good overall, We may just call RMApp#getFinalApplicationStatus here? {code} (appAttempt.getFinalApplicationStatus() == null ? RMServerUtils.createFinalApplicationStatus(appState) : appAttempt.getFinalApplicationStatus() {code} > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch, YARN-2559.2.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138003#comment-14138003 ] Hadoop QA commented on YARN-2559: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669496/YARN-2559.2.patch against trunk revision e3803d0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5002//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5002//console This message is automatically generated. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch, YARN-2559.2.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137636#comment-14137636 ] Jian He commented on YARN-2559: --- To be consistent with the FinalApplicationStatus exposed on RM web UI and CLI, we may publish UNDEFINED state as well in case finalStatus is unavailable ? > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136622#comment-14136622 ] Zhijie Shen commented on YARN-2559: --- The test failure is not related. File a ticket for it: YARN-2564. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136339#comment-14136339 ] Hadoop QA commented on YARN-2559: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12669199/YARN-2559.1.patch against trunk revision 56119fe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4978//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4978//console This message is automatically generated. > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher > -- > > Key: YARN-2559 > URL: https://issues.apache.org/jira/browse/YARN-2559 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.6.0 > Environment: Generice History Service is enabled in Timelineserver > with > yarn.resourcemanager.system-metrics-publisher.enabled=true > So that ResourceManager should Timeline Store for recording application > history information >Reporter: Karam Singh >Assignee: Zhijie Shen > Attachments: YARN-2559.1.patch > > > ResourceManager sometime become un-responsive due to NPE in > SystemMetricsPublisher -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2559) ResourceManager sometime become un-responsive due to NPE in SystemMetricsPublisher
[ https://issues.apache.org/jira/browse/YARN-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135357#comment-14135357 ] Karam Singh commented on YARN-2559: --- Using Timeline Store for AHS/GHS {code} yarn.timeline-service.enabled=true yarn.timeline-service.hostname= yarn.timeline-service.address=:10200 yarn.timeline-service.webapp.address=<:8188 yarn.timeline-service.handler-thread-count=10 yarn.timeline-service.ttl-enable=true yarn.timeline-service.ttl-ms=60480 yarn.timeline-service.leveldb-timeline-store.path=/tmp/timeline yarn.timeline-service.generic-application-history.enabled=true yarn.timeline-service.http-authentication.type=simple yarn.resourcemanager.system-metrics-publisher.enabled=true yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size=10 yarn.timeline-service.generic-application-history.store-class='' {code} Restart ATS (TimelineServer) Restart ResourceManager (RM) After running some application. ResourceManager became unresponsive due to NPE in SystemMetricsPublisher {code} 2014-09-15 05:01:21,279 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(179)) - Error in dispatcher thread java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.publishAppAttemptFinishedEvent(SystemMetricsPublisher.java:318) at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.handleSystemMetricsEvent(SystemMetricsPublisher.java:209) at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher$ForwardingEventHandler.handle(SystemMetricsPublisher.java:431) at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher$ForwardingEventHandler.handle(SystemMetricsPublisher.java:426) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) 2014-09-15 05:01:21,279 INFO event.AsyncDispatcher (AsyncDispatcher.java:dispatch(184)) - Exiting, bbye.. {code} But instead of exiting we into infinite loop for flushingWaiting for AsyncDispatcher to drain. Following are some more log information {code} 2014-09-15 05:01:21,345 INFO ahs.RMApplicationHistoryWriter (RMApplicationHistoryWriter.java:handleWritingApplicationHistoryEvent(157)) - Stored the finish data of application application_1410782288602_0004 2014-09-15 05:01:21,348 ERROR ahs.RMApplicationHistoryWriter (RMApplicationHistoryWriter.java:handleWritingApplicationHistoryEvent(211)) - Error when storing the finish data of container container_1410782288602_0004_01_01 2014-09-15 05:01:21,348 ERROR ahs.RMApplicationHistoryWriter (RMApplicationHistoryWriter.java:handleWritingApplicationHistoryEvent(211)) - Error when storing the finish data of container container_1410782288602_0004_01_02 2014-09-15 05:01:21,451 INFO ipc.Server (Server.java:stop(2437)) - Stopping server on 8032 2014-09-15 05:01:21,462 INFO ipc.Server (Server.java:run(832)) - Stopping IPC Server Responder 2014-09-15 05:01:21,462 INFO ipc.Server (Server.java:stop(2437)) - Stopping server on 8033 2014-09-15 05:01:21,470 INFO ipc.Server (Server.java:run(832)) - Stopping IPC Server Responder 2014-09-15 05:01:21,471 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(994)) - Transitioning to standby state 2014-09-15 05:01:21,471 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(201)) - Stopping ResourceManager metrics system... 2014-09-15 05:01:21,473 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(207)) - ResourceManager metrics system stopped. 2014-09-15 05:01:21,473 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(584)) - ResourceManager metrics system shutdown complete. 2014-09-15 05:01:21,474 INFO event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(138)) - AsyncDispatcher is draining to stop, igonring any new events. 2014-09-15 05:01:21,480 INFO ipc.Server (Server.java:stop(2437)) - Stopping server on 8030 2014-09-15 05:01:21,490 INFO ipc.Server (Server.java:run(832)) - Stopping IPC Server Responder 2014-09-15 05:01:21,500 INFO ipc.Server (Server.java:run(706)) - Stopping IPC Server listener on 8031 2014-09-15 05:01:21,501 INFO ipc.Server (Server.java:run(832)) - Stopping IPC Server Responder 2014-09-15 05:01:21,502 INFO util.AbstractLivelinessMonitor (AbstractLivelinessMonitor.java:run(127)) - NMLivelinessMonitor thread interrupted 2014-09-15 05:01:21,502 ERROR resourcemanager.ResourceManager (ResourceManager.java:run(607)) - Returning, interrupted : java.lang.InterruptedException 2014-09-15 05:01:21,504 DEBUG service.CompositeService (CompositeService.java:stop(151)) - Stopping service #9: Service Dispatcher in state Dispatcher: STARTED 2014-09-15 05:01:21,50