[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487300#comment-15487300 ]

Eric Badger commented on YARN-1468:
-----------------------------------

[~templedf], there's some additional analysis for this failure in YARN-5416 that might explain what you're seeing with this test. I think that we should close one of the two Jiras as a dup of the other.

> TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
> ----------------------------------------------------------------
>
> Key: YARN-1468
> URL: https://issues.apache.org/jira/browse/YARN-1468
> Project: Hadoop YARN
> Issue Type: Test
> Components: resourcemanager
> Reporter: Junping Du
> Assignee: Junping Du
> Priority: Critical
>
> Log is as following:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 149.968 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 44.197 sec <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
>     at junit.framework.Assert.fail(Assert.java:50)
>     at junit.framework.Assert.failNotEquals(Assert.java:287)
>     at junit.framework.Assert.assertEquals(Assert.java:67)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:292)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:826)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:464)
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485549#comment-15485549 ]

Daniel Templeton commented on YARN-1468:
----------------------------------------

For what it's worth, I'm seeing the same issue as [~mitdesai]. I'm going to take a look this week.
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396233#comment-15396233 ]

Eric Badger commented on YARN-1468:
-----------------------------------

[~djp], you are correct. When I encountered the error in a test failure, it was in the place I described above. However, that failure is different from the one in the stack trace that [~mitdesai] included. I should've included my own stack trace to make that point clear. I think that both point to issues, however. I've seen this test fail in a multitude of different ways.
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395946#comment-15395946 ]

Junping Du commented on YARN-1468:
----------------------------------

Hi [~ebadger], that's indeed a nice catch! I think we should fix this test issue. However, it does not seem to be the same issue as the exception stack reported so far. Am I missing something?
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310976#comment-15310976 ]

Eric Badger commented on YARN-1468:
-----------------------------------

[~mitdesai], I saw this test failing in the same way that you described above. I took a look at the test, and either I don't understand the meaning of one of the lines or it's a bug.

The following piece of code (minus the assertEquals) was added by [YARN-1493|https://issues.apache.org/jira/browse/YARN-1493] and doesn't make sense to me. Why are we checking the size against 2 when we check it against 4 immediately after? In my local tests, this loop times out once timeoutSecs >= 40, since rmApp.getAppAttempts().size() is equal to 4 the whole time. This leads me to believe that the assert failure occurs when this loop is executed and the size is actually equal to 2 initially. In that case the loop would exit early, and the size would only get up to 3 (or stay at 2) before the assertEquals against 4 is executed.

{noformat}
// wait for the attempt to be created.
int timeoutSecs = 0;
while (rmApp.getAppAttempts().size() != 2 && timeoutSecs++ < 40) {
  Thread.sleep(200);
}
Assert.assertEquals(4, rmApp.getAppAttempts().size());
{noformat}

I think changing ".size() != 2" to ".size() != 4" will fix this race in the test. Thoughts?

cc [~djp]
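The race described in this comment can be reproduced outside Hadoop with a small stand-alone sketch. Everything here is hypothetical and for illustration only: the class `AttemptWaitSketch`, the helper `waitForCount`, and the list standing in for `rmApp.getAppAttempts()` are assumptions, not code from TestRMRestart. The sketch shows that polling for the wrong target count returns immediately when the count already equals that target, whereas polling for the count that is asserted afterwards waits until the remaining attempts exist.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.IntSupplier;

public class AttemptWaitSketch {

    /**
     * Polls 'current' until it equals 'expected' or 'maxPolls' sleeps have elapsed.
     * Returns the last observed value so the caller can assert on it.
     */
    public static int waitForCount(IntSupplier current, int expected,
                                   int maxPolls, long sleepMillis)
            throws InterruptedException {
        int polls = 0;
        int observed = current.getAsInt();
        while (observed != expected && polls++ < maxPolls) {
            Thread.sleep(sleepMillis);
            observed = current.getAsInt();
        }
        return observed;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the application's attempt list (assumption: starts at 2).
        List<String> attempts = new CopyOnWriteArrayList<>();
        attempts.add("attempt_1");
        attempts.add("attempt_2");

        // Waiting for 2 exits at once because the size is already 2,
        // so a following check against 4 would fail.
        int afterWrongWait = waitForCount(attempts::size, 2, 40, 200);
        System.out.println("after waiting for 2: " + afterWrongWait);

        // Simulate the remaining attempts being created asynchronously.
        Thread creator = new Thread(() -> {
            try {
                for (int i = 3; i <= 4; i++) {
                    Thread.sleep(50);
                    attempts.add("attempt_" + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        creator.start();

        // Waiting for the value that is asserted afterwards closes the race.
        int afterRightWait = waitForCount(attempts::size, 4, 40, 200);
        creator.join();
        System.out.println("after waiting for 4: " + afterRightWait);
    }
}
```

Note that 40 polls of 200 ms bound the wait at roughly 8 seconds, not 40 seconds, so a counter named like `timeoutSecs` in the quoted loop counts iterations rather than seconds.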
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972197#comment-13972197 ]

Junping Du commented on YARN-1468:
----------------------------------

TestRMRestart seems to be failing with a different error now. I will work on it.
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923797#comment-13923797 ]

Tsuyoshi OZAWA commented on YARN-1468:
--------------------------------------

Today I hit a new failure pattern:

{code}
junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<FAILED>
    at junit.framework.Assert.fail(Assert.java:50)
    at junit.framework.Assert.failNotEquals(Assert.java:287)
    at junit.framework.Assert.assertEquals(Assert.java:67)
    at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
    at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:343)
    at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:896)
    at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:522)
{code}
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837759#comment-13837759 ]

Junping Du commented on YARN-1468:
----------------------------------

Thanks [~ozawa].
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837312#comment-13837312 ]

Tsuyoshi OZAWA commented on YARN-1468:
--------------------------------------

Maybe this is a timing bug: I cannot reproduce the problem in my local environment.
[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837363#comment-13837363 ]

Junping Du commented on YARN-1468:
----------------------------------

Hi [~ozawa], the log you specified here is addressed in YARN-1454. Please see whether the patch there fixes the problem you met. The issue I met looks different, and I need to take a closer look at it.

> TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
> ----------------------------------------------------------------
>
> Key: YARN-1468
> URL: https://issues.apache.org/jira/browse/YARN-1468
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Junping Du
> Priority: Critical
>
> Log is as following:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 149.968 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 44.197 sec <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
>     at junit.framework.Assert.fail(Assert.java:50)
>     at junit.framework.Assert.failNotEquals(Assert.java:287)
>     at junit.framework.Assert.assertEquals(Assert.java:67)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:292)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:826)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:464)
> {code}
>
> Another log is as following:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 143.009 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMDelegationTokenRestoredOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 2.077 sec <<< FAILURE!
> java.lang.AssertionError: null
>     at org.junit.Assert.fail(Assert.java:92)
>     at org.junit.Assert.assertTrue(Assert.java:43)
>     at org.junit.Assert.assertTrue(Assert.java:54)
>     at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMDelegationTokenRestoredOnRMRestart(TestRMRestart.java:1259)
> {code}