[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388831#comment-17388831 ] Szilard Nemeth commented on YARN-6272: -- Thanks [~gandras] for picking this up again. Patch LGTM, committed to trunk. Thanks [~bteke] for the review. Also thanks to [~prabhujoseph] who uploaded the initial patch and [~jlowe] for the possible solution proposals added above. [~gandras] Do you think we need to backport this to other older branches? > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Andras Gyori >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-6272-001.patch, YARN-6272.001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388822#comment-17388822 ] Benjamin Teke commented on YARN-6272: - Thanks [~gandras] for picking this up and the patch. LGTM (non-binding). > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-6272-001.patch, YARN-6272.001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388334#comment-17388334 ] Hadoop QA commented on YARN-6272: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 22s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 42s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 39s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 16m 31s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 40s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} the patch passed with
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380600#comment-17380600 ] Hadoop QA commented on YARN-6272: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 32s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 10s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 16m 1s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 0m 53s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 42s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} the patch passed with
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380523#comment-17380523 ] Andras Gyori commented on YARN-6272: [~prabhujoseph] I can take this over, if you do not mind. Your approach works, just tweaked it a little bit. I do not think at this point, without a complete refactor, that using MockRM and MockNM is feasible. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844794#comment-16844794 ] Adam Antal commented on YARN-6272: -- Couldn't we force the container to be allocated in the same node using some locality rules? Because in that case when the NM is allocated elsewhere this test does not test anything. Also in this case it makes sense to use MockRM and MockNM but it would require excessive modification around that testcase. In your patch could you please add a safety check into {{doContainerResourceChange}} before asserting that the size of the updated containers is 1? Asserting that allocateAttempts < 1000 would be ideal. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826350#comment-16826350 ] Prabhu Joseph commented on YARN-6272: - Thanks [~giovanni.fumarola] for reviewing. The testcase heartbeats once with three NMs and expects the increase allocation will happen immediately. But it won't in cases where the allocation happens on some other NM. The allocation has to happen on the same NM as of actual container for which increase resource requested until then the request will be added back and will be processed only on subsequent node update. Heartbeating with only the NM where the container was allocated initially will not require any sleep. But MiniYarnCluster sends node update for all NMs thus the allocation will be random out of three NMs and so the testcase requires wait and retry till the container allocated on right NM out of three. The fix heartbeats with only the right NM. This will increase the possibility (even though MiniYarnCluster does nodeUpdate for all) and does wait and retry till the new increased container gets allocated on same NM. And validated the fix with multiple 500 iterations and did not face test failure. Without the fix, the testcase consistently fails within 50 iterations. The other way is to use MockRM and MockNM (as per above Jason's comment), have tried this and felt lot of changes. Let me know if this is not convincing, will test it with MockRM and MockNM . > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826267#comment-16826267 ] Giovanni Matteo Fumarola commented on YARN-6272: Thanks [~Prabhu Joseph] for the patch. I am not a fan of Sleep instructions in unit tests. Can you explain the fix? > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825705#comment-16825705 ] Prabhu Joseph commented on YARN-6272: - [~giovanni.fumarola] Could you please review this jira when you get time. This fixes intermittent failure of {{TestAMRMClient#testAMRMClientWithContainerResourceChange}}. Thanks. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825509#comment-16825509 ] Hadoop QA commented on YARN-6272: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 20s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 72m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-6272 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12966939/YARN-6272-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c04708fe4810 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a703dae | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24019/testReport/ | | Max. process+thread count | 693 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24019/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently >
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825473#comment-16825473 ] Prabhu Joseph commented on YARN-6272: - There are two issues: 1. Test fails with below when both increase and decrease request processed in same heartbeat. {code} java.lang.AssertionError: expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) {code} 2. Test fails when Updated Container is allocated on some other NM. The requests are added back and reallocated on subsequent node updates. {code} ContainerUpdateContext.java // Allocation happened on NM on the same host, but not on the NM // we need.. We need to signal that this container has to be released. // We also need to add these requests back.. to be reallocated. {code} Have added Wait and Retry with node update from the container's allocated node. The testcase fails consistently while running in a for loop. The fix works fine with allocateAttempts goes up to 50 in few runs. The issue was earlier fixed by YARN-5537 but then missed out during YARN-5221 feature merge. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-6272-001.patch > > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824424#comment-16824424 ] Prabhu Joseph commented on YARN-6272: - Assigning to me, will work on this. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Priority: Major > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974876#comment-15974876 ] Jason Lowe commented on YARN-6272: -- I've also seen this stacktrace on 2.8: {noformat} java.lang.AssertionError: expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:920) at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:813) {noformat} In the above case, it looks like the nodemanager happened to be heartbeating just as the app made the allocate call that asked for the increase request. In that case it was able to process both the increase and the decrease in the same heartbeat which the test explicitly does not expect. The test itself is very fragile. It's launching a full minicluster and uses hardcoded sleeps sprinkled in various places hoping asynchronous events have processed in the interim. That not only directly leads to flaky tests but slows down the unit test unnecessarily. Either the test needs to be made more tolerant of all the asynchronous stuff going on or ditch the minicluster and explicitly manage the cluster heartbeating. The former can be done by having the test poll via app alloc heartbeats until it gets all the responses it needs rather than assume which heartbeats will get which responses. The latter can be done by using MockRM, MockNM, and drain dispatchers so the test knows exactly which heartbeats have been completely processed and thus know which app alloc calls will get the appropriate responses. This latter approach would also eliminate the need for any arbitrary polling/sleeping intervals and speed up the test significantly. > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha3 >Reporter: Ray Chiang > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895007#comment-15895007 ] Ray Chiang commented on YARN-6272: -- Latest stack trace: testAMRMClientWithContainerResourceChange[0](org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) Time elapsed: 10.384 sec <<< FAILURE! java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1127) at org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:1003) > TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently > - > > Key: YARN-6272 > URL: https://issues.apache.org/jira/browse/YARN-6272 > Project: Hadoop YARN > Issue Type: Test > Components: yarn >Affects Versions: 3.0.0-alpha3 >Reporter: Ray Chiang > > I'm seeing this unit test fail fairly often in trunk: > testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient) > Time elapsed: 5.113 sec <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org