[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807770#comment-16807770 ]

Sunil Govindan commented on YARN-4901:
--------------------------------------

+1. Let's get this in.

> MockRM should clear the QueueMetrics when it starts
> ---------------------------------------------------
>
>                 Key: YARN-4901
>                 URL: https://issues.apache.org/jira/browse/YARN-4901
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Daniel Templeton
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-4901-001.patch
>
>
> The {{ResourceManager}} rightly assumes that when it starts, it's starting
> from naught. The {{MockRM}}, however, violates that assumption. For
> example, in {{TestNMReconnect}}, each test method creates a new {{MockRM}}
> instance. The {{QueueMetrics.queueMetrics}} field is static, which means
> that when multiple {{MockRM}} instances are created, the {{QueueMetrics}}
> bleed over. Having the {{MockRM}} clear the {{QueueMetrics}} when it starts
> should resolve the issue. I haven't yet looked at the scope to see how hard
> or easy that is to do.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
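The bleed-over described in the issue text above can be illustrated without Hadoop. What follows is only a minimal sketch: `FakeQueueMetrics` and `FakeMockRM` are hypothetical stand-ins, not the real `QueueMetrics` or `MockRM`, and the `clearQueueMetrics()` helper is an assumed name for the kind of clear-on-start step the issue proposes.

```java
import java.util.HashMap;
import java.util.Map;

class StaticMetricsLeakDemo {
  // Simulates two test methods that each create their own mock RM and submit one app.
  // Returns the number of submitted apps the second "test" observes.
  static int appsSeenBySecondTest(boolean clearOnStart) {
    FakeQueueMetrics.clearQueueMetrics(); // begin each simulation from a known state
    FakeMockRM first = new FakeMockRM(clearOnStart);
    first.rootMetrics().submitApp();
    FakeMockRM second = new FakeMockRM(clearOnStart);
    second.rootMetrics().submitApp();
    return second.rootMetrics().getAppsSubmitted();
  }

  public static void main(String[] args) {
    System.out.println("without clearing: " + appsSeenBySecondTest(false));
    System.out.println("with clearing:    " + appsSeenBySecondTest(true));
  }
}

// Hypothetical stand-in: per-queue metrics cached in a static map that outlives any RM instance.
class FakeQueueMetrics {
  private static final Map<String, FakeQueueMetrics> QUEUE_METRICS = new HashMap<>();
  private int appsSubmitted;

  static synchronized FakeQueueMetrics forQueue(String name) {
    return QUEUE_METRICS.computeIfAbsent(name, n -> new FakeQueueMetrics());
  }

  // Assumed helper name: drops every cached metrics object.
  static synchronized void clearQueueMetrics() {
    QUEUE_METRICS.clear();
  }

  void submitApp() {
    appsSubmitted++;
  }

  int getAppsSubmitted() {
    return appsSubmitted;
  }
}

// Hypothetical stand-in for a mock RM that optionally clears the static metrics when created.
class FakeMockRM {
  FakeMockRM(boolean clearOnStart) {
    if (clearOnStart) {
      FakeQueueMetrics.clearQueueMetrics();
    }
  }

  FakeQueueMetrics rootMetrics() {
    return FakeQueueMetrics.forQueue("root");
  }
}
```

Without the clear step, the second instance inherits the first instance's counter through the shared static map; with it, each instance starts from zero, which is the assumption the description says the real `ResourceManager` makes.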
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807557#comment-16807557 ]

Peter Bacsko commented on YARN-4901:
------------------------------------

[~sunilg] can you review & commit this patch?
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806326#comment-16806326 ]

Wilfred Spiegelenburg commented on YARN-4901:
---------------------------------------------

I have run the test over 2500 times and cannot get the failure to reproduce. I do see some weird things in my local run which could explain the failure. Opened a new jira for this: [YARN-9431|https://issues.apache.org/jira/browse/YARN-9431]
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804999#comment-16804999 ]

Peter Bacsko commented on YARN-4901:
------------------------------------

OK, it turned out that the failed unit test is flaky and was introduced by YARN-8967.
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802910#comment-16802910 ]

Peter Bacsko commented on YARN-4901:
------------------------------------

Failed unit test passed locally several times.
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802864#comment-16802864 ]

Hadoop QA commented on YARN-4901:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 13s | trunk passed |
| +1 | compile | 0m 46s | trunk passed |
| +1 | checkstyle | 0m 39s | trunk passed |
| +1 | mvnsite | 0m 51s | trunk passed |
| +1 | shadedclient | 12m 39s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 19s | trunk passed |
| +1 | javadoc | 0m 30s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 42s | the patch passed |
| +1 | javac | 0m 42s | the patch passed |
| +1 | checkstyle | 0m 34s | the patch passed |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 32s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 28s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 89m 5s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 142m 50s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestAppRunnability |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-4901 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12963879/YARN-4901-001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 09cdd948c628 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b226958 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/23816/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23816/testReport/ |
| Max. process+thread count | 902 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802840#comment-16802840 ]

Szilard Nemeth commented on YARN-4901:
--------------------------------------

hi [~pbacsko]! As per our offline talk, there's no easy way to check if the DefaultMetricsSystem is already running so it's fine to invoke shutdown without checking any condition as it won't have any consequence. +1 (non-binding)
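The reasoning in the comment above (call shutdown unconditionally because it is harmless when nothing is running) can be sketched as a pattern. `FakeMetricsSystem` below is a hypothetical stand-in and makes no claim about how `DefaultMetricsSystem.shutdown()` is implemented; that the real call has no consequence when the system is not running is taken from the comment, not verified here.

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentShutdownDemo {
  // Hypothetical process-wide registry that rejects duplicate names, like a metrics system might.
  static final class FakeMetricsSystem {
    private static final Set<String> REGISTERED = new HashSet<>();

    static synchronized void register(String name) {
      if (!REGISTERED.add(name)) {
        throw new IllegalStateException(name + " already exists!");
      }
    }

    // Safe to call at any time: clearing an already empty registry is a no-op.
    static synchronized void shutdown() {
      REGISTERED.clear();
    }

    static synchronized int registeredCount() {
      return REGISTERED.size();
    }
  }

  // Mimics a mock service start: shut down unconditionally, then register again.
  static int startFresh(String beanName) {
    FakeMetricsSystem.shutdown(); // no "is it running?" check needed
    FakeMetricsSystem.register(beanName);
    return FakeMetricsSystem.registeredCount();
  }

  public static void main(String[] args) {
    System.out.println(startFresh("RMNMInfo")); // first start, nothing to shut down
    System.out.println(startFresh("RMNMInfo")); // second start, previous state dropped first
  }
}
```

Because the shutdown step tolerates an empty registry, the caller never needs to know whether an earlier instance left state behind.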
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802731#comment-16802731 ]

Peter Bacsko commented on YARN-4901:
------------------------------------

[~snemeth] [~templedf] could you look at this short patch?

I also added {{DefaultMetricsSystem.shutdown()}} because it unregisters an object from JMX. If we don't do this, we might get:
{noformat}
org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=ResourceManager,name=RMNMInfo already exists!
	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:135)
	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:110)
	at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:123)
	at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:64)
	at org.apache.hadoop.yarn.server.resourcemanager.RMNMInfo.<init>(RMNMInfo.java:59)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:749)
	...
{noformat}
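For readers unfamiliar with why a leftover registration causes an "already exists" error, here is a rough analogue using only the JDK's `javax.management` API. It is a sketch under assumptions: the `DemoInfo` bean and the `Demo:...` object name are made up for illustration, and the exception shown is JMX's own `InstanceAlreadyExistsException`, whereas the trace above is raised inside `DefaultMetricsSystem.newObjectName`, which this example does not reproduce.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxDuplicateNameDemo {
  // Standard MBean convention: the management interface is named <ClassName>MBean.
  public interface DemoInfoMBean {
    int getNodeCount();
  }

  // Made-up bean used only for this illustration.
  public static class DemoInfo implements DemoInfoMBean {
    @Override
    public int getNodeCount() {
      return 0;
    }
  }

  // Registers a bean twice under the same name and reports whether the second attempt is rejected.
  static boolean secondRegistrationFails(boolean unregisterBetween) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ObjectName name = new ObjectName("Demo:service=FakeResourceManager,name=DemoInfo");
    if (server.isRegistered(name)) {
      server.unregisterMBean(name); // start from a clean slate
    }
    server.registerMBean(new DemoInfo(), name);
    try {
      if (unregisterBetween) {
        server.unregisterMBean(name); // the cleanup step a shutdown would perform
      }
      server.registerMBean(new DemoInfo(), name);
      return false;
    } catch (InstanceAlreadyExistsException e) {
      return true;
    } finally {
      if (server.isRegistered(name)) {
        server.unregisterMBean(name); // leave the platform server as we found it
      }
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("fails without cleanup: " + secondRegistrationFails(false));
    System.out.println("fails with cleanup:    " + secondRegistrationFails(true));
  }
}
```

The platform MBean server is process-wide, so a name registered by one test instance remains taken for the next instance in the same JVM unless something unregisters it in between.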
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802695#comment-16802695 ]

Peter Bacsko commented on YARN-4901:
------------------------------------

It also affects {{TestApplicationLauncher.testAMLaunchAndCleanup}}.
[jira] [Commented] (YARN-4901) MockRM should clear the QueueMetrics when it starts
[ https://issues.apache.org/jira/browse/YARN-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227867#comment-15227867 ]

Karthik Kambatla commented on YARN-4901:
----------------------------------------

TestNMReconnect fails with FairScheduler because of this.

> MockRM should clear the QueueMetrics when it starts
> ---------------------------------------------------
>
>                 Key: YARN-4901
>                 URL: https://issues.apache.org/jira/browse/YARN-4901
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton

--
This message was sent by Atlassian JIRA (v6.3.4#6332)