[ 
https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208231#comment-13208231
 ] 

Uma Maheswara Rao G commented on HDFS-2949:
-------------------------------------------

{quote}
That said, having the safety check described in this JIRA is still valuable, 
{quote}
I agree that adding safety checks is worthwhile. But this cannot prevent 100% 
of split-brain scenarios, right? (For example: a brief network break between 
the active and standby, and the admin accidentally runs -transitionToActive on 
the standby.) I think this will be addressed in the future as part of automatic 
failover and shared storage fencing. But when admins work directly with the 
command line for maintenance purposes, this case can still occur, right?
Also, for the transitionTo* APIs, do we need to ask the user for confirmation 
before actually transitioning? That might make the admin pay closer attention 
before proceeding.
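
To make the confirmation idea concrete, here is a minimal sketch of what such 
a prompt could look like. Everything in it is hypothetical: the class name 
TransitionConfirmation, the confirm() helper, and the prompt wording are 
assumptions for illustration and are not existing HAAdmin code. It only 
proceeds on an explicit "y" or "yes" and treats anything else, including end 
of input, as a refusal.

```java
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical helper, not part of Hadoop: asks the admin to confirm a
// manual state transition before the command is actually sent.
final class TransitionConfirmation {

    // Returns true only if the admin explicitly answers "y" or "yes".
    // An empty line, any other answer, or end of input all mean "no".
    static boolean confirm(String targetState, Scanner in, PrintStream out) {
        out.print("Really transition this NameNode to " + targetState
                + "? (y/N): ");
        out.flush();
        if (!in.hasNextLine()) {
            return false; // no input available, so do not proceed
        }
        String answer = in.nextLine().trim().toLowerCase(Locale.ROOT);
        return answer.equals("y") || answer.equals("yes");
    }
}
```

A caller would wrap the transition in something like 
`if (TransitionConfirmation.confirm("active", new Scanner(System.in), System.out)) { ... }`. 
Defaulting to "no" is a deliberate choice here, so that an accidental Enter 
does not trigger the transition.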
                
> HA: Add check to active state transition to prevent operator-induced split 
> brain
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-2949
>                 URL: https://issues.apache.org/jira/browse/HDFS-2949
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>
> Currently, if the administrator mistakenly calls "-transitionToActive" on one 
> NN while the other one is still active, all hell will break loose. We can add 
> a simple check by having the NN make a getServiceState() RPC to its peer with 
> a short (~1 second?) timeout. If the RPC succeeds and indicates the other 
> node is active, it should refuse to enter active mode. If the RPC fails or 
> indicates standby, it can proceed.
> This is just meant as a preventative safety check - we still expect users to 
> use the "-failover" command which has other checks plus fencing built in.
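
The check described in the issue summary above could be sketched roughly as 
follows. This is an illustration under stated assumptions, not the actual 
patch: the Peer interface, the ServiceState enum, and mayBecomeActive() are 
hypothetical stand-ins for the real RPC proxy and HA state types. The 
behaviour on thread interruption is also an assumption, since the issue only 
covers the "RPC succeeds" and "RPC fails" cases.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the peer-state safety check; names are illustrative.
final class ActiveTransitionGuard {

    enum ServiceState { ACTIVE, STANDBY }

    // Stand-in for the RPC proxy to the other NameNode (assumption).
    interface Peer {
        ServiceState getServiceState() throws Exception;
    }

    // Returns false only when the peer answers within the timeout and reports
    // ACTIVE. A failed or timed-out RPC, or a STANDBY answer, allows the
    // transition, as the issue description proposes.
    static boolean mayBecomeActive(Peer peer, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "peer-state-check");
            t.setDaemon(true); // never keep the JVM alive for a hung RPC
            return t;
        });
        try {
            Callable<ServiceState> call = peer::getServiceState;
            Future<ServiceState> reply = executor.submit(call);
            ServiceState state = reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return state != ServiceState.ACTIVE;
        } catch (TimeoutException | ExecutionException e) {
            return true; // RPC failed or timed out: proceed
        } catch (InterruptedException e) {
            // Assumption: an interrupted caller abandons the transition.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow(); // cancel a still-pending call
        }
    }

    public static void main(String[] args) {
        Peer activePeer = () -> ServiceState.ACTIVE;
        System.out.println("may become active: "
                + mayBecomeActive(activePeer, 1000));
    }
}
```

As noted in the comment thread, this is only a best-effort check: if the peer 
is unreachable because of a network break, the RPC fails and the transition 
is allowed, so it does not replace fencing.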

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
