[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986891#comment-16986891 ]
huhaiyang edited comment on HDFS-15024 at 12/3/19 1:26 PM: ----------------------------------------------------------- Current In RetryPolicies Implemented {code:java} /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); } {code} It is reasonable to consider the number of namenode as a condition for calculating sleep duration was (Author: haiyang hu): Current In RetryPolicies Implemented {code:java} /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); } {code} It is reasonable to consider the number of namenode as a condition for calculating sleep duration > [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --------------------------------------------------------------------------------------------------------------- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement > Affects Versions: 2.10.0, 3.3.0, 3.2.1 > Reporter: huhaiyang > Priority: Major > Attachments: HDFS-15024.001.patch, client_error.log > > > When we enable the ONN , there will be three NN nodes for the client > configuration, > Such as configuration > <property> > <name>dfs.ha.namenodes.ns1</name> > <value>nn2,nn3,nn1</value> > </property> > Currently, > nn2 is in standby state > nn3 is in observer state > nn1 is in active state > When the user performs an access HDFS operation > ./bin/hadoop --loglevel debug fs > -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > -mkdir /user/haiyang1/test8 > You need to request nn1 when you execute the msync method, > Actually connect nn2 first and failover is required > In connection nn3 does not meet the requirements, failover needs to be > performed, but at this time, failover operation needs to be performed during > a period of hibernation > Finally, it took a period of hibernation to connect the successful request to > nn1 > In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current > default implementation is Sleep time is calculated when more than one > failover operation is performed > I think that the Number of NameNodes as a condition of calculation of sleep > time is more reasonable > That is, in the current test, executing failover on connection nn3 does not > need to sleep time to directly connect to the next nn node > See client_error.log for details -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org