[
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403363#comment-13403363
]
Todd Lipcon commented on HDFS-3077:
-----------------------------------
bq. Quorum is a semantics on the client/writer side and not the server side
policy. Hence the protocol for journaling should be generic enough. So lets not
call it QJournalProtocol and make it generic, allowing other types of
clients/writers.
I disagree with this statement. The commit protocol is strongly intertwined
with the way in which the server has to behave. For example, the "new epoch"
command needs to provide back certain information about the current state of
the journals and previous Paxos-style 'accepted' decisions. Trying to shoehorn
it into a generic protocol doesn't make much sense to me.
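To make the coupling concrete, here is a minimal sketch of the kind of state
a journal would have to track and report back on a "new epoch" call. All names
here (JournalNodeSketch, NewEpochResponse, acceptRecovery, and the fields) are
hypothetical and for illustration only; this is not the API in the attached
patch, just an assumption about its general shape.

```java
import java.util.OptionalLong;

// Hypothetical sketch: illustrative names, not the HDFS-3077 patch API.
final class JournalNodeSketch {

    /** What a journal reports back when it promises a new epoch. */
    static final class NewEpochResponse {
        final long previousEpoch;                 // epoch promised before this call
        final long lastWrittenTxId;               // highest txid written to this journal
        final OptionalLong acceptedRecoveryEpoch; // epoch of an earlier 'accepted' decision, if any

        NewEpochResponse(long previousEpoch, long lastWrittenTxId,
                         OptionalLong acceptedRecoveryEpoch) {
            this.previousEpoch = previousEpoch;
            this.lastWrittenTxId = lastWrittenTxId;
            this.acceptedRecoveryEpoch = acceptedRecoveryEpoch;
        }
    }

    private long promisedEpoch = 0;
    private long lastWrittenTxId = 0;
    private OptionalLong acceptedRecoveryEpoch = OptionalLong.empty();

    /** Promise a strictly higher epoch and report the journal's current state. */
    synchronized NewEpochResponse newEpoch(long proposedEpoch) {
        if (proposedEpoch <= promisedEpoch) {
            throw new IllegalStateException("epoch " + proposedEpoch
                + " is not newer than promised epoch " + promisedEpoch);
        }
        long previous = promisedEpoch;
        promisedEpoch = proposedEpoch;
        return new NewEpochResponse(previous, lastWrittenTxId, acceptedRecoveryEpoch);
    }

    /** Record that this journal accepted a recovery decision in the given epoch. */
    synchronized void acceptRecovery(long epoch) {
        checkEpoch(epoch);
        acceptedRecoveryEpoch = OptionalLong.of(epoch);
    }

    /** Accept a write only from the writer holding the currently promised epoch. */
    synchronized void write(long epoch, long txId) {
        checkEpoch(epoch);
        lastWrittenTxId = txId;
    }

    private void checkEpoch(long epoch) {
        if (epoch != promisedEpoch) {
            throw new IllegalStateException("request epoch " + epoch
                + " does not match promised epoch " + promisedEpoch);
        }
    }
}
```

The point of the sketch is that the server has to remember promises and
previously accepted decisions, and reject writers from older epochs. That is
server-side behavior, which a protocol that only knows "append these edits"
has no place for.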
bq. If you see some functionality missing in 3092, lets discuss and add it
there, instead of copying code and changing it separately
3092's "log syncing" stuff doesn't fit with the recovery protocol needed for
correct operation in a quorum commit setting. 3092's method of the JNs
"registering" with the NN doesn't make sense in this system either, since group
membership changes have not yet been designed and are quite complex. So it's
not a matter of adding functionality to 3092; it's a matter of removing a lot
of functionality that just doesn't fit with this commit protocol.
bq. Also 3092 has been in development in open, in incremental fashion. I think
we should follow this, instead of attaching a big patch from github.
I made a best effort to do it in the open and incrementally, but didn't get any
responses from the community. See HDFS-3188 and HDFS-3189 for example, both of
which I posted back in April. I remember in the same discussions you referenced
above that you said you'd take a look at these in the spirit of incremental
progress. I understand you got busy with other things, but I wasn't going to
stop working on the project in the meantime. So, work progressed and now
there's a more fully baked implementation here.
Don't be fooled by the size of the patch. The majority of the lines of code
are essentially boilerplate: protobuf translators, simple code to start/stop
RPC and HTTP servers, etc. I don't think this is unreasonably large to review.
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
> Key: HDFS-3077
> URL: https://issues.apache.org/jira/browse/HDFS-3077
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: ha, name-node
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-3077-partial.txt, hdfs-3077.txt,
> qjournal-design.pdf, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on
> shared storage such as an NFS filer for the shared edit log. One alternative
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject
> which provides a highly available replicated edit log on commodity hardware.
> This JIRA is to implement another alternative, based on a quorum commit
> protocol, integrated more tightly in HDFS and with the requirements driven
> only by HDFS's needs rather than more generic use cases. More details to
> follow.