[ 
https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221370#comment-13221370
 ] 

Bikas Saha commented on HDFS-2185:
----------------------------------

I have attached a state diagram with some ideas on how this could work. 
The rectangles are the primary states of the controller. The ovals are 
actions that must be taken before changing state. The black arrows are the 
results of those actions, and the blue arrows are external events: 
notifications from the ZK leader election library added in HADOOP-7992 and 
health notifications from the HAServiceProtocol.

This expects one change to the HAServiceProtocol: splitting becomeActive() 
into prepareToBecomeActive() and becomeActive(). prepareToBecomeActive() 
does the time-consuming heavy lifting, and the world might change by the 
time it completes. At that point, if the node is still the leader, it can 
quickly becomeActive(); otherwise it can becomeStandby().
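
A minimal sketch of the two-phase activation described above, to make the 
re-check of leadership after the slow step concrete. Only the three method 
names prepareToBecomeActive(), becomeActive() and becomeStandby() come from 
the proposal. The HAService and LeaderElector interfaces, the State enum, 
onElectedLeader() and RecordingService are hypothetical stand-ins, not the 
actual HAServiceProtocol or the HADOOP-7992 library API, and RPC and error 
handling are left out.

```java
// Illustrative sketch only. Apart from the three transition method names,
// every type and method here is an assumption made for this example.
final class FailoverSketch {

    /** Simplified stand-in for the split protocol proposed in the comment. */
    interface HAService {
        void prepareToBecomeActive(); // slow: the time-consuming heavy lifting
        void becomeActive();          // fast: final switch to active
        void becomeStandby();         // fall back if leadership was lost
    }

    /** Hypothetical view of the leader election result at a point in time. */
    interface LeaderElector {
        boolean isLeader();
    }

    enum State { STANDBY, ACTIVE }

    /**
     * Called when this node is notified that it won the election.
     * Leadership is checked again after the slow prepare step, because the
     * world might have changed while it was running.
     */
    static State onElectedLeader(HAService service, LeaderElector elector) {
        service.prepareToBecomeActive();
        if (elector.isLeader()) {
            service.becomeActive();
            return State.ACTIVE;
        }
        service.becomeStandby();
        return State.STANDBY;
    }

    /** Test double that records the order of transition calls. */
    static final class RecordingService implements HAService {
        final StringBuilder calls = new StringBuilder();
        @Override public void prepareToBecomeActive() { calls.append("prepare;"); }
        @Override public void becomeActive() { calls.append("active;"); }
        @Override public void becomeStandby() { calls.append("standby;"); }
    }

    public static void main(String[] args) {
        RecordingService kept = new RecordingService();
        System.out.println(onElectedLeader(kept, () -> true) + " " + kept.calls);

        RecordingService lost = new RecordingService();
        System.out.println(onElectedLeader(lost, () -> false) + " " + lost.calls);
    }
}
```

In this sketch the leadership check is a single poll after the prepare 
step. A real controller would more likely react to the election library's 
notifications, as the blue arrows in the diagram suggest.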
                
> HA: ZK-based FailoverController
> -------------------------------
>
>                 Key: HDFS-2185
>                 URL: https://issues.apache.org/jira/browse/HDFS-2185
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Eli Collins
>            Assignee: Todd Lipcon
>         Attachments: Failover_Controller.jpg
>
>
> This jira is for a ZK-based FailoverController daemon. The FailoverController 
> is a separate daemon from the NN that does the following:
> * Initiates leader election (via ZK) when necessary
> * Performs health monitoring (aka failure detection)
> * Performs fail-over (standby to active and active to standby transitions)
> * Heartbeats to ensure the liveness
> It should have the same/similar interface as the Linux HA RM to aid 
> pluggability.


        
