HA: Add check to active state transition to prevent operator-induced split brain
--------------------------------------------------------------------------------
Key: HDFS-2949
URL: https://issues.apache.org/jira/browse/HDFS-2949
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ha, name-node
Affects Versions: HA branch (HDFS-1623)
Reporter: Todd Lipcon
Currently, if the administrator mistakenly calls "-transitionToActive" on one
NN while the other one is still active, both NNs will consider themselves
active and we end up with split brain. We can add a
simple check by having the NN make a getServiceState() RPC to its peer with a
short (~1 second?) timeout. If the RPC succeeds and indicates the other node is
active, it should refuse to enter active mode. If the RPC fails or indicates
standby, it can proceed.
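
A rough sketch of the proposed check, for illustration only. It does not use
the real Hadoop HA classes: ActiveTransitionGuard, ServiceState and
mayBecomeActive are hypothetical names, and the peer's getServiceState() RPC
is stood in for by a plain Callable. Refusing on thread interruption is also
an assumption, not something specified above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ActiveTransitionGuard {
    /** Hypothetical stand-in for the HA service states. */
    enum ServiceState { ACTIVE, STANDBY }

    /**
     * Decides whether the local NN may enter active mode.
     *
     * @param peerProbe stand-in for the getServiceState() RPC to the peer NN
     * @param timeoutMs how long to wait for the peer's answer
     * @return false only if the peer answered in time and reported ACTIVE
     */
    static boolean mayBecomeActive(Callable<ServiceState> peerProbe, long timeoutMs) {
        // Daemon thread so a hung probe cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "peer-state-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<ServiceState> reply = executor.submit(peerProbe);
            ServiceState peerState = reply.get(timeoutMs, TimeUnit.MILLISECONDS);
            // RPC succeeded: refuse if the peer says it is active.
            return peerState != ServiceState.ACTIVE;
        } catch (TimeoutException | ExecutionException e) {
            // RPC failed or timed out: proceed, as described above.
            return true;
        } catch (InterruptedException e) {
            // Assumption: treat interruption conservatively and refuse.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Cancel the probe if it is still running.
            executor.shutdownNow();
        }
    }
}
```

Note that a failed or timed-out RPC lets the transition proceed, so this only
catches the case where the peer is reachable and reports itself active.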
This is just meant as a preventative safety check - we still expect users to
use the "-failover" command which has other checks plus fencing built in.