[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-16 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1281:
-

Attachment: YARN-1281.1.patch

Thank you for the comment, Mit. I have assigned this issue to myself.

{code}
  @Override
  public ZooKeeper getNewZooKeeper()
      throws IOException, InterruptedException {
    return createClient(watcher, hostPort, 100);
  }
{code}

I suspect that the timeout value is too short to connect to the ZK servers, 
because the Jenkins servers can get overloaded sometimes. The attached patch 
changes the test to use a longer timeout value. I'm running the test hundreds 
of times locally. I'll report the result.
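
To illustrate the suspicion above, here is a minimal sketch of how a fixed wait can expire when the other side is slow. It does not use ZooKeeper at all: simulateConnect and waitForConnected are hypothetical helpers, the delay stands in for an overloaded Jenkins host, and treating the value 100 as milliseconds is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ConnectTimeoutSketch {

    /** Starts a fake handshake that completes after delayMs (hypothetical helper). */
    static CountDownLatch simulateConnect(long delayMs) {
        CountDownLatch connected = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(delayMs);      // stands in for a slow, overloaded server
                connected.countDown();      // "connection established"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
        return connected;
    }

    /** Waits up to timeoutMs for the handshake; returns false if the wait expires. */
    static boolean waitForConnected(CountDownLatch connected, long timeoutMs) {
        try {
            return connected.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        long slowHandshakeMs = 300; // assumed delay on a busy machine
        System.out.println("timeout 100 ms:  "
            + waitForConnected(simulateConnect(slowHandshakeMs), 100));
        System.out.println("timeout 5000 ms: "
            + waitForConnected(simulateConnect(slowHandshakeMs), 5000));
    }
}
```

If the real failure follows this pattern, it would only show up when the host is slow enough to exceed the wait, which would match the intermittent behaviour.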

The following comments are observations from the code and the log.
1. The ZK server starts up correctly, but its client fails to connect to the 
server. We can observe this from the log.
2. stop() is not called on ZKRMStateStore after testing, but its connection 
is cleaned up after testing in ClientBaseWithFixes#tearDown. IIUC, it works 
well.


 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-1281.1.patch, output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-16 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1281:
-

Attachment: YARN-1281.2.patch

Updated the patch to change the ZK-related timeouts correctly.

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-1281.1.patch, YARN-1281.2.patch, output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 





[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-16 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1281:
--

Issue Type: Test  (was: Bug)

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-1281.1.patch, YARN-1281.2.patch, output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 





[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-15 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1281:
-

Attachment: TestZKRMStateStureoSKClientConnections-output.txt

Attached the complete log from the failure. RMZKUtils#getZKAcls() fails to 
read the ACLs. Maybe this is because of setup timing.

{quote}
2014-04-15 08:13:04,712 ERROR [Thread-12] resourcemanager.RMZKUtils (RMZKUtils.java:getZKAcls(51)) - Couldn't read ACLs based on yarn.resourcemanager.zk-acl
2014-04-15 08:13:04,713 INFO  [Thread-12] service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore failed in state INITED; cause: org.apache.hadoop.util.ZKUtil$BadAclFormatException: ACL 'randomstring*' not of expected form scheme:id:perm
org.apache.hadoop.util.ZKUtil$BadAclFormatException: ACL 'randomstring*' not of expected form scheme:id:perm
    at org.apache.hadoop.util.ZKUtil.parseACLs(ZKUtil.java:110)
    at org.apache.hadoop.yarn.server.resourcemanager.RMZKUtils.getZKAcls(RMZKUtils.java:49)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.initInternal(ZKRMStateStore.java:206)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.serviceInit(RMStateStore.java:276)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections$TestZKClient$TestZKRMStateStore.init(TestZKRMStateStoreZKClientConnections.java:79)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections$TestZKClient.getRMStateStore(TestZKRMStateStoreZKClientConnections.java:129)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections.testInvalidZKAclConfiguration(TestZKRMStateStoreZKClientConnections.java:261)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
{quote}
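
For readers unfamiliar with the scheme:id:perm form named in the exception, here is a rough sketch of that kind of format check. It is not Hadoop's ZKUtil.parseACLs: the class AclFormatSketch, the AclEntry holder, and the use of IllegalArgumentException in place of BadAclFormatException are all assumptions made for illustration. Note that the trace comes from testInvalidZKAclConfiguration, so the malformed value may well be intentional in that test.

```java
import java.util.ArrayList;
import java.util.List;

public class AclFormatSketch {

    /** Hypothetical holder for one parsed ACL entry. */
    static final class AclEntry {
        final String scheme;
        final String id;
        final String perms;

        AclEntry(String scheme, String id, String perms) {
            this.scheme = scheme;
            this.id = id;
            this.perms = perms;
        }
    }

    /**
     * Splits a comma-separated ACL string into entries of the form
     * scheme:id:perm. The id may itself contain ':' so the first and
     * last colon are used as separators.
     */
    static List<AclEntry> parse(String aclString) {
        List<AclEntry> entries = new ArrayList<>();
        if (aclString == null) {
            return entries;
        }
        for (String raw : aclString.split(",")) {
            String a = raw.trim();
            if (a.isEmpty()) {
                continue;
            }
            int first = a.indexOf(':');
            int last = a.lastIndexOf(':');
            // Fewer than two colons means the entry cannot be scheme:id:perm.
            if (first == -1 || first == last) {
                throw new IllegalArgumentException(
                    "ACL '" + a + "' not of expected form scheme:id:perm");
            }
            entries.add(new AclEntry(
                a.substring(0, first),
                a.substring(first + 1, last),
                a.substring(last + 1)));
        }
        return entries;
    }

    public static void main(String[] args) {
        AclEntry e = parse("world:anyone:rwcda").get(0);
        System.out.println(e.scheme + " / " + e.id + " / " + e.perms);
        try {
            parse("randomstring*");
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Under this sketch a value such as randomstring* is rejected on every run, regardless of timing, because it contains no colon at all.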

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla
 Attachments: TestZKRMStateStureoSKClientConnections-output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 





[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-15 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1281:
-

Attachment: (was: TestZKRMStateStureoSKClientConnections-output.txt)

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla
 Attachments: output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 





[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-15 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1281:
-

Attachment: output.txt

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla
 Attachments: output.txt


 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 





[jira] [Updated] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently

2014-04-14 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1281:
---

Assignee: (was: Karthik Kambatla)

 TestZKRMStateStoreZKClientConnections fails intermittently
 --

 Key: YARN-1281
 URL: https://issues.apache.org/jira/browse/YARN-1281
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Karthik Kambatla

 The test fails intermittently - haven't been able to reproduce the failure 
 deterministically. 


