[
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071388#comment-15071388
]
Masatake Iwasaki commented on HDFS-9376:
----------------------------------------
The failover thread in {{HAStressTestHarness}} triggers a failover
periodically, sleeping for a fixed time between failovers.
{{msBetweenFailovers}} is set to 1000 ms for {{TestSeveralNameNodes}}.
{code}
for (int i = 0; i < nns; i++) {
  // next NameNode in the ring; wraps back to 0 after the last one
  int next = (i + 1) % nns;
  ...
  cluster.transitionToStandby(i);
  cluster.transitionToActive(next);
  ...
  Thread.sleep(msBetweenFailovers);
}
{code}
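To make the rotation concrete, here is a small standalone sketch that lists
the failover pairs the loop above walks through for a given number of
NameNodes. It is not the harness code: the class and method names are
hypothetical, and it only reproduces the {{(i + 1) % nns}} index arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
public class FailoverOrderSketch {

    // Returns "from->to" pairs in the order the loop visits them.
    static List<String> failoverPairs(int nns) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < nns; i++) {
            // same wrap-around as in the harness excerpt
            int next = (i + 1) % nns;
            pairs.add(i + "->" + next);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // With three NameNodes the active role moves 0->1, 1->2, 2->0.
        System.out.println(failoverPairs(3));
    }
}
```

With a fixed 1000 ms sleep after each pair, the active role moves to a
different NameNode roughly once per second.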
The client's retry proxy sleeps on failover for a time that grows
exponentially with the number of retries. The client can sleep for up to
around 15 seconds if the operation fails repeatedly, so it may not get enough
effective run time.
{noformat}
2015-12-24 12:22:00,784 [Thread-250] INFO retry.RetryInvocationHandler
(RetryInvocationHandler.java:invoke(147)) - Exception while invoking create of
class ClientNamenodeProtocolTranslatorPB over localhost/127.0.0.1:42201 after 4
fail over attempts. Trying to fail over after sleeping for 10161ms.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category WRITE is not supported in state standby. Visit
https://s.apache.org/sbnn-error
{noformat}
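The growth of the sleep time can be sketched as a capped exponential
backoff. This is a minimal illustration, not the actual retry policy code:
the class name, the 500 ms base and the 15000 ms cap are assumptions chosen
for the example, and the randomization that the logged value of 10161 ms
suggests is left out.

```java
// Hypothetical illustration class; not part of Hadoop.
public class FailoverBackoffSketch {

    // Assumed values for illustration only.
    static final long BASE_MS = 500;
    static final long MAX_MS = 15_000;

    // Capped exponential delay after the given number of failovers (no jitter).
    static long delayMs(int failovers) {
        // bound the shift so the multiplication cannot overflow
        int shift = Math.max(0, Math.min(failovers, 30));
        return Math.min(BASE_MS * (1L << shift), MAX_MS);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.println(n + " failovers -> sleep " + delayMs(n) + " ms");
        }
    }
}
```

Under these assumptions the sleep already reaches several seconds after a
handful of failed attempts, while the harness moves the active role every
1000 ms. The client can therefore wake up to find its target in standby again.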
> TestSeveralNameNodes fails occasionally
> ---------------------------------------
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Masatake Iwasaki
>
> TestSeveralNameNodes has been failing in precommit builds. It usually times
> out on waiting for the last thread to finish writing.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)