[
https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244341#comment-13244341
]
Todd Lipcon commented on HDFS-3092:
-----------------------------------
Hi Suresh. I took a look at the design document, and I think it actually shares
a lot with what I'm doing in HDFS-3077. Hopefully we can share some portions of
the code and design.
Here are some points I think need elaboration in the design doc:
- How does the fencing command ensure that prior NNs can no longer access the
JD after it completes? I think you need to have the JDs record the sequence
number of each NN so they can reject past NNs from coming back to life.
- I don't think the following can be done correctly:
{quote}
a. Choose the transaction >= Q JDs from JournalList.
b. Else choose the highest transaction ID from a JD in the JournalList.
3. All the JDs perform recovery to transaction ID Ft .
{quote}
because it's possible that there are distinct transactions with the same txid
at the beginning of a log segment. For example, consider the following
situation with two NNs (NN1 and NN2) and three JDs (JD1, JD2, JD3):
1. NN1 writes txid 1 to JD1, then crashes before writing to JD2 and JD3.
2. NN2 initiates fencing, but only succeeds in contacting JD2 and JD3, so it
does not see the edit made in step 1.
3. NN2 writes txid 1 to JD2 and JD3, then crashes.
4. One of the two NNs recovers. It sees that all JDs have txid 1. The
fencing/synchronization process you've described cannot distinguish between the
correct txn which was written to a quorum and the incorrect txn which was only
written to JD1.
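
The two points above can be sketched as a small simulation. This is only an
illustration, not code from HDFS-3092 or HDFS-3077: the JournalDaemon class,
its fence/write methods, and the integer "epoch" standing in for the per-NN
sequence number are all hypothetical names. It shows that a JD which records
the writer's sequence number can reject a stale NN, that a txid-only view of
the four-step scenario is ambiguous, and that tagging each entry with the
writer's sequence number is one possible way to tell the entries apart (an
assumption made here for illustration).

```python
from dataclasses import dataclass, field


@dataclass
class JournalDaemon:
    """Hypothetical JD that remembers the newest writer epoch it has seen."""
    name: str
    promised_epoch: int = 0
    log: dict = field(default_factory=dict)  # txid -> (epoch, payload)

    def fence(self, epoch: int) -> bool:
        # Accept a new writer only if its epoch is newer than any seen so far.
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def write(self, epoch: int, txid: int, payload: str) -> bool:
        # Reject writes from a writer older than the promised epoch.
        if epoch < self.promised_epoch:
            return False
        self.log[txid] = (epoch, payload)
        return True


jd1, jd2, jd3 = (JournalDaemon(n) for n in ("JD1", "JD2", "JD3"))
jds = [jd1, jd2, jd3]

# NN1 becomes the writer with epoch 1 on all JDs.
for jd in jds:
    jd.fence(1)

# Step 1: NN1 writes txid 1 to JD1 only, then crashes.
jd1.write(1, 1, "edit from NN1")

# Step 2: NN2 fences with epoch 2 but only reaches JD2 and JD3.
jd2.fence(2)
jd3.fence(2)

# Step 3: NN2 writes its own txid 1 to JD2 and JD3, then crashes.
jd2.write(2, 1, "edit from NN2")
jd3.write(2, 1, "edit from NN2")

# Step 4: a recovery that looks only at txids sees three identical answers.
txid_view = {jd.name: max(jd.log) for jd in jds}

# With the writer epoch stored next to each entry, the entries differ and
# the one from the newest writer can be preferred.
winner = max((jd.log[1] for jd in jds), key=lambda entry: entry[0])

# A past NN coming back to life is rejected by a JD that was fenced.
stale_write_accepted = jd2.write(1, 2, "late edit from NN1")

print(txid_view)
print(winner)
print(stale_write_accepted)
```

Note that JD1 never saw NN2's fencing call in this run, so it would still
accept writes from NN1; the rejection only holds on the JDs that NN2 reached.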
> Enable journal protocol based editlog streaming for standby namenode
> --------------------------------------------------------------------
>
> Key: HDFS-3092
> URL: https://issues.apache.org/jira/browse/HDFS-3092
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ha, name-node
> Affects Versions: 0.24.0, 0.23.3
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Attachments: MultipleSharedJournals.pdf
>
>
> Currently the standby namenode relies on reading shared editlogs to stay
> current with the active namenode for namespace changes. BackupNode used
> streaming edits from the active namenode to do the same. This jira is to
> explore using journal protocol based editlog streams for the standby
> namenode. A daemon in the standby will get the editlogs from the active and
> write them to local edits. To begin with, the existing standby mechanism of
> reading from a file will continue to be used, but it will read from the
> local edits instead of from the shared edits.