[
https://issues.apache.org/jira/browse/HDFS-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124360#comment-13124360
]
Justin Joseph commented on HDFS-2354:
-------------------------------------
bq. Even in this case, you could build a FailoverController that runs on the
same JVM, but interacts with Namenode using HAServiceProtocol. The approach
from HDFS-1974 does not preclude you from that.
Yes. In fact, I am trying to arrive at the right set of interfaces, ones that
allow HA-enabled Namenodes to work with any type of Cluster Resource Manager /
HA Agent (Linux HA, a ZK-based Leader Election Service that may be either
embedded or external, etc.).
I have figured out the fundamental difference between our approaches. In your
approach, HA Service Protocol is the set of commands for triggering state
changes on the Namenode, from Active to Standby or from Standby to Active. In
my design, I have modeled it as a Local Resource Manager which sits between an
HA Agent (or Cluster Resource Manager) and a Resource Agent (Namenode). I had
to, since I considered the case where both the HA Agent and the Namenode run
together in the same JVM. In this case, the Local Resource Manager is
responsible for starting the Namenode as either Active or Standby and then, at
a later point in time, switching it to a different role.
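As a rough illustration of the layering described above (this is not code from
the attached patch), the sketch below shows a Local Resource Manager that
receives role decisions from an HA Agent and either starts the Resource Agent
in that role or switches it later. All names here (ResourceAgent,
LocalResourceManager, onRoleDecision, RecordingAgent) are hypothetical and
chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LocalResourceManagerSketch {

    /** Roles a managed resource can take; only the two named in the comment. */
    enum Role { ACTIVE, STANDBY }

    /** Hypothetical Resource Agent contract (stands in for the Namenode). */
    interface ResourceAgent {
        void start(Role initialRole);
        void switchTo(Role newRole);
        Role currentRole();
    }

    /** Hypothetical Local Resource Manager between HA Agent and Resource Agent. */
    static class LocalResourceManager {
        private final ResourceAgent agent;
        private boolean started = false;

        LocalResourceManager(ResourceAgent agent) {
            this.agent = agent;
        }

        /** Called by the HA Agent whenever it decides on a role for this node. */
        synchronized void onRoleDecision(Role role) {
            if (!started) {
                // First decision: bring the resource up directly in that role.
                agent.start(role);
                started = true;
            } else if (agent.currentRole() != role) {
                // Later decisions: switch only if the role actually changes.
                agent.switchTo(role);
            }
        }
    }

    /** Toy in-memory agent that records what it was asked to do. */
    static class RecordingAgent implements ResourceAgent {
        final List<String> log = new ArrayList<>();
        private Role role;

        public void start(Role r) { role = r; log.add("start:" + r); }
        public void switchTo(Role r) { role = r; log.add("switch:" + r); }
        public Role currentRole() { return role; }
    }

    public static void main(String[] args) {
        RecordingAgent agent = new RecordingAgent();
        LocalResourceManager lrm = new LocalResourceManager(agent);
        lrm.onRoleDecision(Role.STANDBY); // starts as Standby
        lrm.onRoleDecision(Role.STANDBY); // no change, nothing happens
        lrm.onRoleDecision(Role.ACTIVE);  // later switch to Active
        System.out.println(agent.log);
    }
}
```

Because the HA Agent only talks to the Local Resource Manager in this sketch,
the same agent could run embedded in the Namenode JVM or externally without
the Resource Agent knowing the difference.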
Taking a consolidated view of the patches for HADOOP-7455, HDFS-1974 and
HDFS-2301, and comparing them with the design I have attached to this JIRA, I
feel our approaches are similar in many respects. The implementation in
HDFS-1974 has the drawback that Namenode / NamenodeRpcServer implements
HAServiceProtocol, which goes against the goal of HDFS-1623 to build a
framework for HA.
From my perspective, the considerations for HA Service Protocol are the
following:
1) It should be easy to plug in the various approaches for building High
Availability for the Namenode (for example, one may choose the Backup Node
based approach, the Shared Storage approach, or any other).
2) It should be possible to work with any type of Cluster Resource Manager.
3) It should be possible to add different states, in addition to the Active and
Standby states.
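To make consideration 3 concrete, here is a minimal sketch of one way states
could be left open for extension: states are identified by name instead of a
fixed enum, and each HA implementation registers the transitions it permits.
This is an assumption made for illustration, not the design in the attached
patch; HaState, TransitionTable and the NEUTRAL state are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExtensibleHaStatesSketch {

    /** A state identified by name, so implementations can add their own. */
    static final class HaState {
        final String name;

        HaState(String name) { this.name = name; }

        @Override public boolean equals(Object o) {
            return o instanceof HaState && ((HaState) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
        @Override public String toString() { return name; }
    }

    // The two states named in the discussion.
    static final HaState ACTIVE = new HaState("ACTIVE");
    static final HaState STANDBY = new HaState("STANDBY");

    /** Holds the transitions a particular HA implementation permits. */
    static class TransitionTable {
        private final Map<HaState, Set<HaState>> allowed = new HashMap<>();

        TransitionTable allow(HaState from, HaState to) {
            allowed.computeIfAbsent(from, k -> new HashSet<>()).add(to);
            return this;
        }

        boolean isAllowed(HaState from, HaState to) {
            return allowed.getOrDefault(from, new HashSet<>()).contains(to);
        }
    }

    public static void main(String[] args) {
        // Hypothetical custom state added by one HA implementation.
        HaState neutral = new HaState("NEUTRAL");

        TransitionTable table = new TransitionTable()
            .allow(STANDBY, ACTIVE)
            .allow(ACTIVE, STANDBY)
            .allow(neutral, STANDBY);

        System.out.println(table.isAllowed(neutral, STANDBY)); // registered above
        System.out.println(table.isAllowed(neutral, ACTIVE));  // never registered
    }
}
```

A Backup Node based implementation and a Shared Storage based implementation
could each supply their own table under this scheme, which is one possible
reading of consideration 1 as well.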
I have tried to build a comprehensive framework which takes the above points
into consideration. I request that you take a detailed look at the design and
patch I have submitted and share your feedback with specific details.
> Generalize the HAServiceProtocol interface
> ------------------------------------------
>
> Key: HDFS-2354
> URL: https://issues.apache.org/jira/browse/HDFS-2354
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Reporter: Justin Joseph
> Assignee: Justin Joseph
> Attachments: HAService_fw_Class_Diagram.JPG, HDFS-2354.1.patch,
> HDFS-2354.patch
>
>
> This JIRA intends to revisit the patches committed for HADOOP-7455 and
> HDFS-1974 and to provide more generic interfaces which allow alternative HA
> implementations to co-exist while complying with HAServiceProtocol.
> Some of the considerations are:
> 1) Support life cycle methods (start*() and stop() APIs) in HAServiceProtocol
> 2) Support custom states in HAServiceProtocol
> 3) As per the patch submitted for HDFS-1974, Namenode implements HAService
> interface. This needs to be reconsidered.
> I will elaborate on these points, in the form of comments below.