[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4312:
-
Fix Version/s: 2.8.0

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.2, 2.6.3
>
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-11-05 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4312:
--
Fix Version/s: 2.6.3

Cherry-picked the fix to branch-2.6. Thanks [~varun_saxena]!

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Fix For: 2.7.2, 2.6.3
>
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-11-02 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4312:
--
Target Version/s: 2.7.2, 2.6.3  (was: 2.7.2)

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Fix For: 2.7.2
>
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-11-02 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4312:
--
Target Version/s: 2.7.2  (was: 2.7.2, 2.6.2)

Targeting 2.6.3 now that 2.6.2 has shipped.

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Fix For: 2.7.2
>
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-10-30 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-4312:
-
Hadoop Flags: Reviewed

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-10-28 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4312:
---
Description: 
These timeouts happen because we do ZK sync operation on RM startup after 
YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
for a couple of tests in TestSubmitApplicationWithRMHA.

{noformat}
testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
  Time elapsed: 5.162 sec  <<< ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
at 
org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
at 
org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)



testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
  Time elapsed: 5.146 sec  <<< ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at sun.misc.Unsafe.park(Native Method)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-10-28 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4312:
---
Attachment: YARN-4312-branch-2.7.01.patch

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4312-branch-2.7.01.patch
>
>
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> 

[jira] [Updated] (YARN-4312) TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out

2015-10-28 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4312:
---
Attachment: YARN-4312-branch-2.6.01.patch

> TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of 
> the test cases time out 
> 
>
> Key: YARN-4312
> URL: https://issues.apache.org/jira/browse/YARN-4312
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1, 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4312-branch-2.6.01.patch, 
> YARN-4312-branch-2.7.01.patch
>
>
> These timeouts happen because we do ZK sync operation on RM startup after 
> YARN-3798 which delays RM startup a bit making the timeouts of 5 s. too small 
> for a couple of tests in TestSubmitApplicationWithRMHA.
> {noformat}
> testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.162 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:964)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1005)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1001)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1001)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:303)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:191)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMHATestBase.startRMs(RMHATestBase.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA.testHandleRMHADuringSubmitApplicationCallWithSavedApplicationState(TestSubmitApplicationWithRMHA.java:234)
> 
> testHandleRMHADuringSubmitApplicationCallWithoutSavedApplicationState(org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA)
>   Time elapsed: 5.146 sec  <<< ERROR!
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.syncInternal(ZKRMStateStore.java:944)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.startInternal(ZKRMStateStore.java:320)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceStart(RMStateStore.java:562)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:559)
>