[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails intermittently

2016-10-07 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Description: {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} 
sleeps for a fixed time before its assertions. This can fail occasionally, even 
though 10 seconds is generally long enough. I think we can use 
{{GenericTestUtils.waitFor()}} to retry the assertions.  (was: h6. Error Message
{quote}
Unable to become active. Local node did not get an opportunity to do so from 
ZooKeeper, or the local node took too long to transition to active.
{quote}
h6. Stacktrace
{quote}
org.apache.hadoop.ha.ServiceFailedException: Unable to become active. Local 
node did not get an opportunity to do so from ZooKeeper, or the local node took 
too long to transition to active.
at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:690)
at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:62)
at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:607)
at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:604)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1795)
at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:604)
at org.apache.hadoop.ha.ZKFCRpcServer.gracefulFailover(ZKFCRpcServer.java:94)
at org.apache.hadoop.ha.TestZKFailoverController.testGracefulFailoverMultipleZKfcs(TestZKFailoverController.java:590)
{quote}

See recent failing build:
# https://builds.apache.org/job/PreCommit-HADOOP-Build/10705/testReport/org.apache.hadoop.ha/TestZKFailoverController/testGracefulFailoverMultipleZKfcs/
# to add more

I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions.)

> o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails 
> intermittently
> 
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10985.000.patch
>
>
> {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} sleeps for a 
> fixed time before its assertions. This can fail occasionally, even though 10 
> seconds is generally long enough. I think we can use 
> {{GenericTestUtils.waitFor()}} to retry the assertions.
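
The difference between a fixed sleep and a retried check can be sketched as follows. This is a minimal, self-contained illustration rather than the attached patch: the class {{WaitForSketch}} and its {{waitFor}} helper are hypothetical stand-ins written only to show the polling idea behind {{GenericTestUtils.waitFor()}}, and the background thread merely simulates a node that becomes active after some delay. The interval and timeout values are arbitrary assumptions.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in that polls a condition instead of sleeping a fixed time. */
public class WaitForSketch {

  /**
   * Re-evaluates {@code check} every {@code checkEveryMillis} until it returns
   * true, or throws TimeoutException once {@code waitForMillis} has elapsed.
   */
  static void waitFor(BooleanSupplier check, long checkEveryMillis, long waitForMillis)
      throws TimeoutException, InterruptedException {
    final long start = System.nanoTime();
    final long timeoutNanos = waitForMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - start >= timeoutNanos) {
        throw new TimeoutException("Condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulated state: flips to true once the (pretend) failover completes.
    final AtomicBoolean active = new AtomicBoolean(false);

    Thread transition = new Thread(() -> {
      try {
        Thread.sleep(200); // the transition takes an unpredictable amount of time
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      active.set(true);
    });
    transition.start();

    // Returns as soon as the condition holds; fails only after the full timeout.
    waitFor(active::get, 50, 5000);
    System.out.println("became active: " + active.get());
    transition.join();
  }
}
```

With this shape the test finishes as soon as the condition holds and only fails after the whole timeout has passed, so a slow build machine costs extra waiting time instead of a spurious failure.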



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails intermittently

2016-10-07 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Status: Patch Available  (was: Open)

> o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails 
> intermittently
> 
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10985.000.patch
>
>
> h6. Error Message
> {quote}
> Unable to become active. Local node did not get an opportunity to do so from 
> ZooKeeper, or the local node took too long to transition to active.
> {quote}
> h6. Stacktrace
> {quote}
> org.apache.hadoop.ha.ServiceFailedException: Unable to become active. Local 
> node did not get an opportunity to do so from ZooKeeper, or the local node 
> took too long to transition to active.
>   at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:690)
>   at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:62)
>   at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:607)
>   at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:604)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1795)
>   at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:604)
>   at org.apache.hadoop.ha.ZKFCRpcServer.gracefulFailover(ZKFCRpcServer.java:94)
>   at org.apache.hadoop.ha.TestZKFailoverController.testGracefulFailoverMultipleZKfcs(TestZKFailoverController.java:590)
> {quote}
> See recent failing build:
> # https://builds.apache.org/job/PreCommit-HADOOP-Build/10705/testReport/org.apache.hadoop.ha/TestZKFailoverController/testGracefulFailoverMultipleZKfcs/
> # to add more
> I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions.






[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails intermittently

2016-10-07 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Attachment: HDFS-10985.000.patch

[~ste...@apache.org] and [~atm], does this make sense to you?

> o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails 
> intermittently
> 
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10985.000.patch
>
>






[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails intermittently

2016-10-07 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10985:
-
Component/s: test

> o.a.h.ha.TestZKFailoverController#testGracefulFailoverMultipleZKfcs fails 
> intermittently
> 
>
> Key: HDFS-10985
> URL: https://issues.apache.org/jira/browse/HDFS-10985
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
>


