[jira] [Commented] (HDFS-14963) Add DFS Client caching active namenode mechanism.

Xudong Cao (Jira) Wed, 06 Nov 2019 19:49:50 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968901#comment-16968901
 ]


Xudong Cao commented on HDFS-14963:
-----------------------------------

[~weichiu] That is true within a single FileSystem instance, but when, for example, a 
new client process starts, it simply begins its RPC calls from the 1st NN.

> Add DFS Client caching active namenode mechanism.
> -------------------------------------------------
>
>                 Key: HDFS-14963
>                 URL: https://issues.apache.org/jira/browse/HDFS-14963
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.1.3
>            Reporter: Xudong Cao
>            Assignee: Xudong Cao
>            Priority: Minor
>
> In a multi-NameNode deployment, the HDFS client always sends its first RPC call to 
> the 1st NameNode, then simply polls the NameNodes in turn until it finds the 
> current Active NameNode. 
> This causes at least two problems:
> 1. Extra failover overhead, especially when new client processes are started 
> frequently.
> 2. Unnecessary log output. Suppose there are 3 NNs and the 3rd is the ANN. A 
> client that starts its RPC calls with the 1st NN is silent when it fails over 
> from the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 
> 3rd NN it prints unnecessary logs. In some scenarios these logs become very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  at 
>  ...{code}
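
The cost described above can be made concrete with a small sketch. It assumes the client probes NameNodes in configured order and wraps around; the class and method names are hypothetical and are not part of the HDFS client API.

```java
/**
 * Illustrative only: counts how many failovers a client performs before it
 * reaches the Active NameNode, assuming it probes NameNodes in configured
 * order and wraps around. Names here are hypothetical, not HDFS APIs.
 */
final class FailoverCostSketch {
    private FailoverCostSketch() {}

    /**
     * @param startIndex  index of the NameNode the client tries first
     * @param activeIndex index of the current Active NameNode
     * @param nnCount     number of configured NameNodes
     * @return number of failovers needed to reach the Active NameNode
     */
    static int failoversNeeded(int startIndex, int activeIndex, int nnCount) {
        if (nnCount <= 0) {
            throw new IllegalArgumentException("nnCount must be positive");
        }
        if (startIndex < 0 || startIndex >= nnCount
                || activeIndex < 0 || activeIndex >= nnCount) {
            throw new IllegalArgumentException("index out of range");
        }
        // Round-robin distance from the starting NameNode to the active one.
        return (activeIndex - startIndex + nnCount) % nnCount;
    }
}
```

With 3 NNs and the 3rd active, `failoversNeeded(0, 2, 3)` gives 2 failovers for a fresh client, while a client that already starts at the ANN, `failoversNeeded(2, 2, 3)`, needs none.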
>  
> We can solve this by caching the index of the current Active NameNode in a 
> local file on the client side, so that:
> 1. When a client starts, it reads the current Active NameNode index from the 
> cache file and makes its RPC call directly to the ANN.
> 2. Each time the client fails over, it writes the latest Active NameNode 
> index to the cache file.
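
The two steps above could look roughly like the following. This is a minimal sketch under stated assumptions, not the patch attached to this issue: the class name `ActiveNameNodeCache`, the one-integer file format, and the fallback to index 0 are all hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/**
 * Hypothetical client-side cache of the Active NameNode index.
 * Assumed file format: a single decimal integer.
 */
final class ActiveNameNodeCache {
    private final Path cacheFile;
    private final int nnCount;

    ActiveNameNodeCache(Path cacheFile, int nnCount) {
        this.cacheFile = cacheFile;
        this.nnCount = nnCount;
    }

    /**
     * Step 1: read the cached index at client startup. Falls back to 0
     * (the 1st NN, i.e. today's behaviour) if the file is missing,
     * unreadable, malformed, or holds an out-of-range index.
     */
    int readActiveIndex() {
        try {
            String text = new String(
                Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
            int index = Integer.parseInt(text);
            return (index >= 0 && index < nnCount) ? index : 0;
        } catch (IOException | NumberFormatException e) {
            return 0;
        }
    }

    /**
     * Step 2: record the new index after a failover. The value is written
     * to a sibling temp file and then moved over the cache file, which
     * reduces the chance that another process reads a half-written file.
     */
    void writeActiveIndex(int index) throws IOException {
        if (index < 0 || index >= nnCount) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        Path tmp = cacheFile.resolveSibling(cacheFile.getFileName() + ".tmp");
        Files.write(tmp, Integer.toString(index).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, cacheFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A stale cache is harmless in this sketch: the client would start at the cached index, fail over as it does today, and then overwrite the file with the index it actually found.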
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

