[
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968901#comment-16968901
]
Xudong Cao commented on HDFS-14963:
-----------------------------------
[~weichiu] Within the same FileSystem instance that's right, but if, for
example, a new client process starts, it simply begins its RPC calls from the
1st NN.
> Add DFS Client caching active namenode mechanism.
> -------------------------------------------------
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.1.3
> Reporter: Xudong Cao
> Assignee: Xudong Cao
> Priority: Minor
>
> In a multi-NameNode setup, the HDFS client always sends its first RPC call to
> the 1st NameNode, then simply polls the NameNodes in turn until it finds the
> current Active NameNode.
> This causes at least two problems:
> 1. Extra failover overhead, especially when new client processes are started
> frequently.
> 2. Unnecessary log output. Suppose there are 3 NNs and the 3rd is the ANN. A
> client that starts its RPC with the 1st NN is silent when it fails over from
> the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd
> NN it prints some unnecessary logs. In some scenarios these logs can be very
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> Operation category READ is not supported in state standby. Visit
> https://s.apache.org/sbnn-error
> at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
> at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
> at
> ...{code}
>
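The failover cost described above can be illustrated with a small model. This is only a sketch: `FailoverProbeSketch` and `failoversUntilActive` are hypothetical names, not part of the HDFS client, and the model assumes the client moves to the next NameNode in configured order (wrapping around) after each standby response.

```java
/** Hypothetical model of a client probing NameNodes in configured order. */
final class FailoverProbeSketch {

    /**
     * Counts the failovers a client performs before it reaches the active
     * NameNode, assuming it starts at startIndex and moves to the next
     * NameNode (wrapping around) after each standby response.
     */
    static int failoversUntilActive(int nameNodeCount, int activeIndex, int startIndex) {
        if (nameNodeCount <= 0
                || activeIndex < 0 || activeIndex >= nameNodeCount
                || startIndex < 0 || startIndex >= nameNodeCount) {
            throw new IllegalArgumentException("index out of range");
        }
        int failovers = 0;
        int current = startIndex;
        while (current != activeIndex) {
            current = (current + 1) % nameNodeCount;
            failovers++;
        }
        return failovers;
    }

    public static void main(String[] args) {
        // 3 NNs, the 3rd (index 2) is active, client starts at the 1st (index 0).
        System.out.println(failoversUntilActive(3, 2, 0)); // prints 2
        // Same cluster, but the client starts directly at the active NN.
        System.out.println(failoversUntilActive(3, 2, 2)); // prints 0
    }
}
```

Under these assumptions every new client process pays the two failovers in the 3-NN example, while a client that already knows the active index pays none.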
> We can solve this by caching the current Active NameNode index in a local
> file on the client side, so that:
> 1. When a client starts, it reads the current Active NameNode index from the
> cache file and makes its RPC call directly to the ANN.
> 2. After each failover, the client writes the latest Active NameNode index
> to the cache file.
>
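The two steps of the proposal could look roughly like the following. This is a minimal sketch, not the patch attached to the issue: the class name `ActiveNameNodeIndexCache`, the file format (a single decimal integer), and the fallback to index 0 are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical client-side cache of the active NameNode index. */
final class ActiveNameNodeIndexCache {
    private final Path cacheFile;
    private final int nameNodeCount;

    ActiveNameNodeIndexCache(Path cacheFile, int nameNodeCount) {
        this.cacheFile = cacheFile;
        this.nameNodeCount = nameNodeCount;
    }

    /**
     * Step 1: read the cached index at client start. Falls back to 0 (the
     * 1st NameNode) if the file is missing, unreadable, or out of range.
     */
    int read() {
        try {
            String text = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
            int index = Integer.parseInt(text);
            return (index >= 0 && index < nameNodeCount) ? index : 0;
        } catch (IOException | NumberFormatException e) {
            return 0;
        }
    }

    /**
     * Step 2: record the new active index after a failover. The value is
     * written to a temporary file in the same directory and then moved over
     * the cache file, to reduce the chance of a reader seeing a partial write.
     */
    void write(int activeIndex) throws IOException {
        if (activeIndex < 0 || activeIndex >= nameNodeCount) {
            throw new IllegalArgumentException("index out of range: " + activeIndex);
        }
        Path dir = cacheFile.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "ann-index", ".tmp");
        Files.write(tmp, Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, cacheFile, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ann-cache-demo");
        ActiveNameNodeIndexCache cache =
                new ActiveNameNodeIndexCache(dir.resolve("active-nn-index"), 3);
        System.out.println(cache.read()); // no cache file yet, so falls back to 0
        cache.write(2);                   // a failover ended at the 3rd NN
        System.out.println(cache.read()); // a new client would now start at index 2
    }
}
```

A stale cache entry only costs the usual failovers, since the client would still fail over from whatever index it reads; the sketch leaves open how multiple client processes on the same host share the file.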