[ 
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403363#comment-13403363
 ] 

Todd Lipcon commented on HDFS-3077:
-----------------------------------

bq. Quorum is a semantics on the client/writer side and not the server side 
policy. Hence the protocol for journaling should be generic enough. So lets not 
call it QJournalProtocol and make it generic, allowing other types of 
clients/writers.

I disagree with this statement. The commit protocol is strongly intertwined 
with the way in which the server has to behave. For example, the "new epoch" 
command needs to return certain information about the current state of the 
journals and about any previous Paxos-style 'accepted' decisions. Trying to 
shoehorn it into a generic protocol doesn't make much sense to me.
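
To make the point concrete, here is a minimal sketch of the kind of state a 
"new epoch" call has to hand back to the writer. It is not the code in the 
attached patch: the class, method, and field names (NewEpochSketch, 
JournalNode, NewEpochResponse, acceptRecovery, and so on) are hypothetical, 
and the state is kept in memory purely for illustration.

```java
import java.util.Optional;

/** Hypothetical sketch; names and fields are assumptions, not the patch's API. */
public class NewEpochSketch {

    /** State a journal node reports back when a writer starts a new epoch. */
    static final class NewEpochResponse {
        final long lastPromisedEpoch;         // epoch promised before this call
        final long highestTxId;               // last transaction this node has logged
        final Optional<Long> acceptedInEpoch; // epoch of an earlier 'accepted' recovery decision, if any

        NewEpochResponse(long lastPromisedEpoch, long highestTxId,
                         Optional<Long> acceptedInEpoch) {
            this.lastPromisedEpoch = lastPromisedEpoch;
            this.highestTxId = highestTxId;
            this.acceptedInEpoch = acceptedInEpoch;
        }
    }

    /** In-memory stand-in for a journal node (assumed behavior, for illustration). */
    static final class JournalNode {
        private long promisedEpoch = 0;
        private long highestTxId = 0;
        private Optional<Long> acceptedInEpoch = Optional.empty();

        /** Promise the new epoch and return the state the writer needs for recovery. */
        synchronized NewEpochResponse newEpoch(long proposedEpoch) {
            if (proposedEpoch <= promisedEpoch) {
                throw new IllegalStateException("epoch " + proposedEpoch
                        + " is not newer than promised epoch " + promisedEpoch);
            }
            long previous = promisedEpoch;
            promisedEpoch = proposedEpoch;
            return new NewEpochResponse(previous, highestTxId, acceptedInEpoch);
        }

        /** Log an edit; writers from an older epoch are fenced off. */
        synchronized void journal(long epoch, long txId) {
            checkEpoch(epoch);
            highestTxId = Math.max(highestTxId, txId);
        }

        /** Record a Paxos-style 'accepted' recovery decision for this epoch. */
        synchronized void acceptRecovery(long epoch) {
            checkEpoch(epoch);
            acceptedInEpoch = Optional.of(epoch);
        }

        private void checkEpoch(long epoch) {
            if (epoch < promisedEpoch) {
                throw new IllegalStateException("stale epoch " + epoch);
            }
        }
    }

    public static void main(String[] args) {
        JournalNode jn = new JournalNode();
        jn.newEpoch(1);
        jn.journal(1, 100);
        jn.acceptRecovery(1);
        NewEpochResponse r = jn.newEpoch(2);
        System.out.println(r.lastPromisedEpoch + " " + r.highestTxId + " " + r.acceptedInEpoch);
    }
}
```

Even in this toy version, the response type and the fencing checks only make 
sense for a writer that runs the quorum recovery protocol, which is why the 
server side cannot be treated as a generic journaling interface.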

bq. If you see some functionality missing in 3092, lets discuss and add it 
there, instead of copying code and changing it separately

3092's "log syncing" stuff doesn't fit with the recovery protocol needed for 
correct operation in a quorum commit setting. 3092's method of the JNs 
"registering" with the NN doesn't make sense in this system either, since group 
membership changes are not yet part of the design and are quite complex. So 
it's not a matter of adding functionality to 3092; it's a matter of removing a 
lot of functionality that just doesn't fit with this commit protocol.

bq. Also 3092 has been in development in open, in incremental fashion. I think 
we should follow this, instead of attaching a big patch from github.

I made a best effort to do it in the open and incrementally, but didn't get any 
responses from the community. See HDFS-3188 and HDFS-3189 for example, both of 
which I posted back in April. I remember in the same discussions you referenced 
above that you said you'd take a look at these in the spirit of incremental 
progress. I understand you got busy with other things, but I wasn't going to 
stop working on the project in the meantime. So, work progressed and now 
there's a more fully baked implementation here.

Don't be fooled by the size of the patch: the majority of the lines of code 
are essentially boilerplate (protobuf translators, simple code to start/stop 
RPC and HTTP servers, etc.). I don't think this is unreasonably large to 
review.
                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, hdfs-3077.txt, 
> qjournal-design.pdf, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on 
> shared storage such as an NFS filer for the shared edit log. One alternative 
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject 
> which provides a highly available replicated edit log on commodity hardware. 
> This JIRA is to implement another alternative, based on a quorum commit 
> protocol, integrated more tightly in HDFS and with the requirements driven 
> only by HDFS's needs rather than more generic use cases. More details to 
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
