[
https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777427#comment-16777427
]
Xiao Liang commented on HDFS-14201:
-----------------------------------
Thanks [~hexiaoqiao] for pointing out the manual transition; it's a valid
point. I think combining the logic in [^HDFS-14201.002.patch] and
[^HDFS-14201.003.patch] could be an option, so that when the switch for this
feature is on:
# in auto-failover mode, ZKFC chooses a ready-to-serve NameNode to become
active, because NameNodes in safemode report UNHEALTHY;
# in manual mode, a NameNode in safemode cannot transition to active.
A single configuration item would turn both behaviors on or off. What do you
think, [~hexiaoqiao]? I will upload a new patch as proposed if you think
it's a reasonable option.
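To make the combined behavior concrete, here is a rough sketch of the two
checks sharing one switch. It is only an illustration, not the code in any of
the attached patches: the class names, the disallowSafemodeActive flag, and
the electActive helper are all hypothetical, and the real ZKFC election goes
through ZooKeeper rather than a simple loop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class SafemodeFailoverSketch {

    /** Hypothetical stand-in for a NameNode; not the real HDFS class. */
    static final class NameNodeSketch {
        final String id;
        final boolean inSafemode;
        // Hypothetical switch: one setting controls both checks below.
        final boolean disallowSafemodeActive;
        boolean active = false;

        NameNodeSketch(String id, boolean inSafemode, boolean disallowSafemodeActive) {
            this.id = id;
            this.inSafemode = inSafemode;
            this.disallowSafemodeActive = disallowSafemodeActive;
        }

        /** Auto-failover side: the health this node would report to ZKFC. */
        boolean isHealthy() {
            return !(disallowSafemodeActive && inSafemode);
        }

        /** Manual side: refuse the transition while in safemode if the switch is on. */
        void transitionToActive() {
            if (disallowSafemodeActive && inSafemode) {
                throw new IllegalStateException(
                    id + " is in safemode; refusing to become active");
            }
            active = true;
        }
    }

    /** Simplified election: the first candidate that reports healthy becomes active. */
    static Optional<NameNodeSketch> electActive(List<NameNodeSketch> candidates) {
        for (NameNodeSketch nn : candidates) {
            if (nn.isHealthy()) {
                nn.transitionToActive();
                return Optional.of(nn);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        NameNodeSketch nn1 = new NameNodeSketch("nn1", true, true);   // still in safemode
        NameNodeSketch nn2 = new NameNodeSketch("nn2", false, true);  // ready to serve
        Optional<NameNodeSketch> elected = electActive(Arrays.asList(nn1, nn2));
        System.out.println(elected.map(nn -> nn.id).orElse("none"));
    }
}
```

With the switch off, both checks pass regardless of safemode, which matches
the current behavior described in the issue.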
> Ability to disallow safemode NN to become active
> ------------------------------------------------
>
> Key: HDFS-14201
> URL: https://issues.apache.org/jira/browse/HDFS-14201
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: auto-failover
> Affects Versions: 3.1.1, 2.9.2
> Reporter: Xiao Liang
> Assignee: Xiao Liang
> Priority: Major
> Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch,
> HDFS-14201.003.patch
>
>
> Currently with HA, a Namenode in safemode can be selected as active. For
> availability of both reads and writes, Namenodes that are not in safemode
> are better choices to become active.
> It can take tens of minutes for a cold-started Namenode to leave safemode,
> especially when HDFS holds a large number of files and blocks. If a Namenode
> in safemode becomes active, the cluster will not be fully functional for
> quite a while, even though it could be if another Namenode that is not in
> safemode were chosen instead.
> The proposal here is to add an option that lets a Namenode report itself as
> UNHEALTHY to ZKFC while it is in safemode, so that only fully functioning
> Namenodes can become active, improving the general availability of the cluster.