[
https://issues.apache.org/jira/browse/HDFS-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209201#comment-13209201
]
Jitendra Nath Pandey commented on HDFS-2914:
--------------------------------------------
bq. Even if the check will necessarily be performed within 5 seconds of
becoming active, we might as well run the check during the process of starting
active services.
I thought about this a bit. Checking for resources while starting active
services might get tricky, because at that point the namenode is still in
standby. If it enters safe mode and the transition then fails, we would have to
take care to bring it back out of safe mode. I also suspect that if it enters
safe mode, some active services may not start simply because of the safe mode,
and that would mean loss of service. For the same reason, we cannot throw an
exception when resources are low either.
I am leaning towards separating the two scenarios (a failed transition and low
resources, although low resources is not really a failure), i.e. the standby
transitions to active irrespective of its resource status, and the check for
resources is done independently once the transition to active has completed
successfully. This is consistent with the fact that low resources is not a
failure: the cluster is still available in read-only mode.
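The separation proposed above could be sketched roughly as follows. This is a
toy model, not the actual NameNode code: the class `HaTransitionSketch`, the
nested `NameNodeModel`, and the methods `transitionToActive()` and
`checkResources()` are hypothetical names chosen for illustration, and the
resource probe is just a `BooleanSupplier` standing in for whatever the real
resource monitor consults.

```java
import java.util.function.BooleanSupplier;

public class HaTransitionSketch {

    enum HaState { STANDBY, ACTIVE }

    /** Toy stand-in for a namenode; all names here are hypothetical. */
    static final class NameNodeModel {
        private HaState state = HaState.STANDBY;
        private boolean safeMode = false;
        private final BooleanSupplier resourcesAvailable;

        NameNodeModel(BooleanSupplier resourcesAvailable) {
            this.resourcesAvailable = resourcesAvailable;
        }

        /**
         * The transition never consults the resource status, so low
         * resources can neither fail it nor leave the node in safe mode
         * halfway through.
         */
        void transitionToActive() {
            state = HaState.ACTIVE;
        }

        /**
         * Independent resource check, e.g. called periodically by a monitor.
         * A standby is left alone; only an active node enters safe mode.
         */
        void checkResources() {
            if (state != HaState.ACTIVE) {
                return;
            }
            if (!resourcesAvailable.getAsBoolean()) {
                safeMode = true; // still serves reads, rejects writes
            }
        }

        HaState state() { return state; }

        boolean inSafeMode() { return safeMode; }
    }

    public static void main(String[] args) {
        // Resources are reported as low for the whole run.
        NameNodeModel nn = new NameNodeModel(() -> false);

        nn.checkResources();      // standby: no effect
        nn.transitionToActive();  // succeeds regardless of resources
        System.out.println(nn.state() + " safeMode=" + nn.inSafeMode());

        nn.checkResources();      // active: now enters safe mode
        System.out.println(nn.state() + " safeMode=" + nn.inSafeMode());
    }
}
```

In this sketch the transition has nothing to roll back, because safe mode can
only be entered by the separate check after the node is already active.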
> HA: Standby should not enter safemode when resources are low
> ------------------------------------------------------------
>
> Key: HDFS-2914
> URL: https://issues.apache.org/jira/browse/HDFS-2914
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, name-node
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Hari Mankude
> Assignee: Hari Mankude
> Attachments: HDFS-2914-HDFS-1623, HDFS-2914-HDFS-1623,
> HDFS-2914-HDFS-1623.patch, hdfs-2914
>
>
> When shared edits dir is bounced, standby NN is put into safemode by the
> NameNodeResourceMonitor(). However, there is no path for it to exit out of
> safe mode when shared edits dir reappears.