[ 
https://issues.apache.org/jira/browse/HDFS-16514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qinyuren updated HDFS-16514:
----------------------------
    Description: 
Recently, we enabled the [Standby Read] feature in our test cluster and deployed 
4 NameNodes as follows:
node1 -> active nn
node2 -> standby nn
node3 -> observer nn
node4 -> observer nn

If we set 'dfs.client.failover.random.order=true', the client may fail over 
twice and wait a long time before it sends msync to the active NameNode. 

!image-2022-03-21-18-11-37-191.png|width=698,height=169!

I think we can reduce the sleep time of the first several failovers based on the 
number of NameNodes.

For example, if 4 NameNodes are configured, the sleep time of the first three 
failover operations would be zero.
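
To make the proposal concrete, here is a minimal sketch of how the sleep time could be derived from the number of configured NameNodes. It is not the actual Hadoop retry policy code: the class FailoverSleepSketch, the method sleepMillis, and all of its parameters are hypothetical names, the 500 ms base and 15 s cap are illustrative values, and the backoff is deterministic here for readability (a real client would probably add randomization).

```java
// Hypothetical sketch, not Hadoop's retry policy implementation.
final class FailoverSleepSketch {

    /**
     * Returns the sleep time before the given failover attempt (1-based).
     * The first (numNameNodes - 1) failovers do not sleep, so the client can
     * try every other NameNode once before it starts backing off.
     */
    static long sleepMillis(int failoverAttempt, int numNameNodes,
                            long baseMillis, long maxMillis) {
        int freeFailovers = Math.max(numNameNodes - 1, 0);
        if (failoverAttempt <= freeFailovers) {
            return 0L;
        }
        // Count backoff steps only from the first failover that has to wait.
        int delayedAttempt = failoverAttempt - freeFailovers;
        // Bound the shift so the doubling cannot overflow for realistic bases.
        int shift = Math.min(delayedAttempt - 1, 20);
        return Math.min(baseMillis << shift, maxMillis);
    }

    public static void main(String[] args) {
        // 4 NameNodes, as in the cluster described above.
        for (int attempt = 1; attempt <= 6; attempt++) {
            System.out.println("failover " + attempt + " -> sleep "
                + sleepMillis(attempt, 4, 500L, 15_000L) + " ms");
        }
    }
}
```

With 4 NameNodes, the first three failovers return zero in this sketch, and the exponential backoff only starts at the fourth failover.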

  was:
Recently, we enabled the [Standby Read] feature in our test cluster and deployed 
4 NameNodes as follows:
node1 -> active nn
node2 -> standby nn
node3 -> observer nn
node4 -> observer nn

If we set 'dfs.client.failover.random.order=true', the client may fail over 
twice and wait a long time before it sends msync to the active NameNode. 

!image-2022-03-21-18-11-37-191.png|width=698,height=169!

I think we can reduce the sleep time of the first several failovers based on the 
number of NameNodes.


> Reduce the failover sleep time if multiple namenode are configured
> ------------------------------------------------------------------
>
>                 Key: HDFS-16514
>                 URL: https://issues.apache.org/jira/browse/HDFS-16514
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: qinyuren
>            Priority: Major
>         Attachments: image-2022-03-21-18-11-37-191.png


