[ https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208161#comment-13208161 ]
Hari Mankude commented on HDFS-2949:
------------------------------------
If the -failover command already handles this situation (and others) correctly,
why not deprecate -transitionToActive entirely?
> HA: Add check to active state transition to prevent operator-induced split
> brain
> --------------------------------------------------------------------------------
>
> Key: HDFS-2949
> URL: https://issues.apache.org/jira/browse/HDFS-2949
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, name-node
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Todd Lipcon
>
> Currently, if the administrator mistakenly calls "-transitionToActive" on one
> NN while the other one is still active, all hell will break loose. We can add
> a simple check by having the NN make a getServiceState() RPC to its peer with
> a short (~1 second?) timeout. If the RPC succeeds and indicates the other
> node is active, it should refuse to enter active mode. If the RPC fails or
> indicates standby, it can proceed.
> This is just meant as a preventative safety check - we still expect users to
> use the "-failover" command which has other checks plus fencing built in.
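
For illustration only, the check described in the issue could be sketched
roughly as below. This is not the actual NameNode code: the class
ActiveTransitionGuard, the ServiceState enum, and the Callable standing in for
the getServiceState() RPC are all hypothetical names made up for this sketch.
The ~1 second timeout from the description is passed in as a parameter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of the pre-transition safety check; not Hadoop's API. */
class ActiveTransitionGuard {

    /** Stand-in for the HA state a peer reports (assumed for this sketch). */
    enum ServiceState { ACTIVE, STANDBY }

    /**
     * Decides whether this node may enter active mode.
     *
     * @param peerStateRpc  placeholder for the getServiceState() RPC to the peer
     * @param timeoutMillis how long to wait for the peer's answer
     * @return false only if the peer answered in time and reported ACTIVE
     */
    static boolean mayBecomeActive(Callable<ServiceState> peerStateRpc,
                                   long timeoutMillis) {
        // Run the probe on a daemon thread so a hung peer cannot block shutdown.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "peer-state-probe");
            t.setDaemon(true);
            return t;
        });
        Future<ServiceState> reply = executor.submit(peerStateRpc);
        try {
            ServiceState peerState = reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            // RPC succeeded: refuse only if the peer says it is active.
            return peerState != ServiceState.ACTIVE;
        } catch (TimeoutException | ExecutionException e) {
            // RPC timed out or failed: per the description, proceed.
            return true;
        } catch (InterruptedException e) {
            // Assumption of this sketch: if interrupted locally, refuse to be safe.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            reply.cancel(true);
            executor.shutdownNow();
        }
    }
}
```

Note that, as in the description, an unreachable peer lets the transition
proceed, so this only guards against the simple operator mistake. It does not
replace the fencing done by -failover.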