[
https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237367#comment-13237367
]
Todd Lipcon commented on HDFS-2185:
-----------------------------------
Hi Bikas. The important bits of the code are only ~200 lines. Is there really
much value in a detailed design doc? In my opinion, if the code itself isn't
clear and self-documenting enough to make the design obvious, then the code
needs to be better. If there's anything unclear in the code, please let me know
and I'll improve the javadocs and inline comments. A general overview of the
design is posted above, though the code takes a less formal state-machine
approach.
> HA: ZK-based FailoverController
> -------------------------------
>
> Key: HDFS-2185
> URL: https://issues.apache.org/jira/browse/HDFS-2185
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Eli Collins
> Assignee: Todd Lipcon
> Attachments: Failover_Controller.jpg, hdfs-2185.txt
>
>
> This jira is for a ZK-based FailoverController daemon. The FailoverController
> is a separate daemon from the NN that does the following:
> * Initiates leader election (via ZK) when necessary
> * Performs health monitoring (aka failure detection)
> * Performs fail-over (standby to active and active to standby transitions)
> * Heartbeats to ensure liveness
> It should have the same/similar interface as the Linux HA RM to aid
> pluggability.
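
The interplay between health monitoring and leader election in the description
above can be sketched roughly as follows. This is a minimal, hypothetical
illustration in plain Java, not the code attached to this issue: the names
FailoverSketch, FakeNode, Controller, and all callback names are invented here,
and the ZooKeeper lock is simulated by direct method calls rather than a real
ZK client. Heartbeating is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the HDFS-2185 patch. */
public class FailoverSketch {
    enum NodeState { ACTIVE, STANDBY }

    /** Stand-in for the NameNode that the controller manages. */
    static class FakeNode {
        NodeState state = NodeState.STANDBY;
        final List<String> log = new ArrayList<>();

        void transitionTo(NodeState next) {
            if (state != next) {
                log.add(state + "->" + next);
                state = next;
            }
        }
    }

    /** Reacts to health-monitor and election callbacks (both simulated). */
    static class Controller {
        private final FakeNode node;
        private boolean healthy = false;
        private boolean inElection = false;

        Controller(FakeNode node) {
            this.node = node;
        }

        /** Health monitor callback: join the election only while healthy. */
        void onHealthChanged(boolean nowHealthy) {
            healthy = nowHealthy;
            inElection = nowHealthy;
            if (!nowHealthy) {
                // Assumption: an unhealthy node gives up any lock it holds
                // and is moved to standby.
                node.transitionTo(NodeState.STANDBY);
            }
        }

        /** Election callback: stand-in for acquiring the ZK lock. */
        void onElected() {
            if (inElection && healthy) {
                node.transitionTo(NodeState.ACTIVE);
            }
        }

        /** Election callback: stand-in for losing the ZK lock or session. */
        void onLockLost() {
            node.transitionTo(NodeState.STANDBY);
        }
    }

    public static void main(String[] args) {
        FakeNode nn = new FakeNode();
        Controller fc = new Controller(nn);
        fc.onElected();            // ignored: node has not reported healthy
        fc.onHealthChanged(true);  // joins the election
        fc.onElected();            // standby to active
        fc.onHealthChanged(false); // active to standby
        System.out.println(nn.log);
    }
}
```

Running main prints [STANDBY->ACTIVE, ACTIVE->STANDBY]. Keeping the controller
as a pair of callbacks, rather than an explicit state table, is one way to read
the "less formal" approach mentioned in the comment, but that reading is an
assumption.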