[
https://issues.apache.org/jira/browse/HDFS-16514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
qinyuren updated HDFS-16514:
----------------------------
Description:
Recently, we used the [Standby Read] feature in our test cluster and deployed
4 namenodes as follows:
node1 -> active nn
node2 -> standby nn
node3 -> observer nn
node4 -> observer nn
If we set 'dfs.client.failover.random.order=true', the client may fail over
twice and wait a long time before it sends msync to the active namenode.
!image-2022-03-21-18-11-37-191.png|width=698,height=169!
I think we can reduce the sleep time of the first several failovers based on the
number of namenodes.
For example, if 4 namenodes are configured, the sleep time of the first three
failover operations is set to zero.
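
The proposal above could be sketched roughly as follows. This is only an
illustration, not the actual patch: the class FailoverSleepSketch, the method
sleepMillis and its parameters are hypothetical names, and the capped,
deterministic exponential backoff used once the zero-sleep failovers are used
up is an assumption made to keep the example simple.

```java
public final class FailoverSleepSketch {
    private FailoverSleepSketch() {}

    /**
     * Hypothetical helper: how long to sleep before the next failover.
     *
     * @param failovers    number of failovers already performed (0 for the first one)
     * @param numNamenodes number of configured namenodes
     * @param baseMillis   base delay for the backoff (assumed parameter)
     * @param maxMillis    upper bound for the delay (assumed parameter)
     */
    public static long sleepMillis(int failovers, int numNamenodes,
                                   long baseMillis, long maxMillis) {
        // The first (numNamenodes - 1) failovers do not sleep, so every
        // other namenode can be tried once without waiting.
        int freeFailovers = Math.max(numNamenodes - 1, 0);
        if (failovers < freeFailovers) {
            return 0L;
        }
        // After that, back off exponentially, capped at maxMillis.
        // The shift is bounded to avoid overflow for large failover counts.
        int attempt = Math.min(failovers - freeFailovers, 20);
        long delay = baseMillis << attempt;
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        // Example values only: 4 namenodes, 500 ms base, 15000 ms cap.
        for (int failovers = 0; failovers < 9; failovers++) {
            System.out.println("failover " + failovers + " -> sleep "
                + sleepMillis(failovers, 4, 500L, 15000L) + " ms");
        }
    }
}
```

With 4 namenodes, the sketch returns zero for the first three failovers and
only starts backing off from the fourth one, which matches the example in the
description.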
> Reduce the failover sleep time if multiple namenodes are configured
> -------------------------------------------------------------------
>
> Key: HDFS-16514
> URL: https://issues.apache.org/jira/browse/HDFS-16514
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: qinyuren
> Priority: Major
> Attachments: image-2022-03-21-18-11-37-191.png