[
https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401969#comment-13401969
]
Todd Lipcon commented on HDFS-3077:
-----------------------------------
bq. I want to understand how the changes can be reconciled with 3092. Currently
BackupNode is being updated to use the JournalService.
The journal interface as exposed by the quorum-capable Journal Node looks
different enough from the BackupNode that I don't see any merit to combining
the IPC protocols. It only muddies the interaction, IMO. For example, the
QJournalProtocol has the concept of a "journal ID" so that each JournalNode can
host journals for multiple namespaces at once, as well as the epoch concept
which makes no sense in a BackupNode scenario. If we wanted to extend HDFS to
act more like a true quorum-driven system (a la ZooKeeper) where each of the
nodes maintains a full namespace as equal peers, we'd need to do more work on
the commit protocol (e.g., adding an explicit "commit" RPC distinct from
"journal"). That kind of change hasn't been proposed anywhere that I'm aware
of, so I didn't want to complicate this design by considering it.
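To illustrate why the journal ID and epoch concepts set the JournalNode apart
from a BackupNode, here is a minimal sketch. It is not the actual
QJournalProtocol or JournalNode code: the class JournalNodeSketch, its methods
newEpoch/journal/editCount, and the rule that only the writer holding the
latest epoch may append are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; names and rules here are illustrative assumptions,
// not the real QJournalProtocol interface.
class JournalNodeSketch {

    /** Per-journal state: the highest epoch promised so far and the accepted edits. */
    static final class Journal {
        long promisedEpoch = 0;
        final List<String> edits = new ArrayList<>();
    }

    // One node hosts an independent journal per namespace, keyed by journal ID.
    private final Map<String, Journal> journals = new HashMap<>();

    private Journal journalFor(String journalId) {
        return journals.computeIfAbsent(journalId, id -> new Journal());
    }

    /** A writer claims an epoch; accepted only if strictly higher than any seen so far. */
    synchronized boolean newEpoch(String journalId, long epoch) {
        Journal j = journalFor(journalId);
        if (epoch <= j.promisedEpoch) {
            return false;
        }
        j.promisedEpoch = epoch;
        return true;
    }

    /** Append an edit; a writer whose epoch is not the current one is rejected. */
    synchronized boolean journal(String journalId, long epoch, String edit) {
        Journal j = journalFor(journalId);
        if (epoch != j.promisedEpoch) {
            return false; // stale (or unannounced) writer is fenced off
        }
        j.edits.add(edit);
        return true;
    }

    /** Number of edits accepted so far for the given journal. */
    synchronized int editCount(String journalId) {
        return journalFor(journalId).edits.size();
    }
}
```

In this sketch, state for "ns1" and "ns2" never interacts, and a writer that
still holds epoch 1 is refused once another writer has claimed epoch 2.
Neither idea has an obvious counterpart when a BackupNode streams edits for a
single namespace from a single active writer.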
At this point I would advocate removing the BackupNode entirely, as I don't
know of a single person who has used it in the ~2 years since it was
introduced. But that's a separate discussion.
bq. Once this is done, we were planning to merge 3092 into trunk. How should we
proceed to merge 3077 and 3092 to trunk?
I used a bunch of the HDFS-3092 branch code and design in developing this
JIRA, so I would consider it to be "incorporated" into the 3077 branch already.
So, I would advocate abandoning the current 3092 branch and treating it as a
stepping stone (server-side only) along the way to the full solution (server-
and client-side implementation). Of course I'll make sure that Brandon and
Hari are given their due credit as co-authors of this patch.
bq. Is code review going to be based off of this or code changes into a branch
on Apache Hadoop code base?
I posted the git branch just for reference, since some contributors find it
easier to do a git pull rather than manually apply the patches locally for
review. But the link above is to the exact same code I've attached to the JIRA.
Feel free to review by looking at the patch or at the branch. Would it be
helpful for me to make a branch in SVN and push the pre-review patch series
there for review instead of the external GitHub repository? Let me know.
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
> Key: HDFS-3077
> URL: https://issues.apache.org/jira/browse/HDFS-3077
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: ha, name-node
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-3077-partial.txt, hdfs-3077.txt,
> qjournal-design.pdf, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on
> shared storage such as an NFS filer for the shared edit log. One alternative
> that has been proposed is to depend on BookKeeper, a ZooKeeper subproject
> which provides a highly available replicated edit log on commodity hardware.
> This JIRA is to implement another alternative, based on a quorum commit
> protocol, integrated more tightly in HDFS and with the requirements driven
> only by HDFS's needs rather than more generic use cases. More details to
> follow.