[ 
https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244341#comment-13244341
 ] 

Todd Lipcon commented on HDFS-3092:
-----------------------------------

Hi Suresh. I took a look at the design document, and I think it actually shares 
a lot with what I'm doing in HDFS-3077. Hopefully we can share some portions of 
the code and design.

Here are some points I think need elaboration in the design doc:
- How does the fencing command ensure that prior NNs can no longer access the 
JD after it completes? I think you need to have the JDs record the sequence 
number of each NN so they can reject requests from past NNs that come back to 
life.
- I don't think the following can be done correctly:
{quote}
a. Choose the transaction >= Q JDs from JournalList.
b. Else choose the highest transaction ID from a JD in the JournalList.
3. All the JDs perform recovery to transaction ID Ft .
{quote}
because it's possible that there are distinct transactions with the same txid 
at the beginning of a log segment. For example, consider the following 
situation with two NNs (NN1 and NN2) and three JDs (JD1, JD2, JD3):

1. NN1 writes txid 1 to JD1, then crashes before writing to JD2 and JD3.
2. NN2 initiates fencing, but only succeeds in contacting JD2 and JD3. So, it 
does not see the edit made in step 1.
3. NN2 writes txid 1 to JD2 and JD3, then crashes.
4. One of the two NNs recovers. It sees that all JDs have txid 1. The 
fencing/synchronization process you've described cannot distinguish between the 
correct txn which was written to a quorum and the incorrect txn which was only 
written to JD1.
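
To make the scenario above concrete, here is a minimal Python sketch that replays the four steps against in-memory journals. Everything in it is hypothetical: the `Journal` class, its `fence` and `write` methods, the `promised_seq` field, and the payload strings are illustration only, not the API of the design doc or of HDFS-3077. It also assumes that each stored edit is tagged with the writer's sequence number, which is one possible way to tell the two copies of txid 1 apart, not something the design doc specifies.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Hypothetical in-memory stand-in for one JD (not a real HDFS class)."""
    name: str
    promised_seq: int = 0  # highest NN sequence number this JD has accepted
    log: dict = field(default_factory=dict)  # txid -> (writer_seq, payload)

    def fence(self, seq: int) -> bool:
        """Admit a new writer only if its sequence number is strictly newer."""
        if seq <= self.promised_seq:
            return False
        self.promised_seq = seq
        return True

    def write(self, seq: int, txid: int, payload: str) -> bool:
        """Reject writes from any NN older than the last one admitted."""
        if seq < self.promised_seq:
            return False
        self.log[txid] = (seq, payload)
        return True


def run_scenario():
    jd1, jd2, jd3 = Journal("JD1"), Journal("JD2"), Journal("JD3")
    # NN1 (sequence number 1) becomes the writer on all three JDs.
    for jd in (jd1, jd2, jd3):
        jd.fence(1)
    # Step 1: NN1 writes txid 1 to JD1 only, then crashes.
    jd1.write(1, 1, "edit from NN1")
    # Step 2: NN2 (sequence number 2) fences, but reaches only JD2 and JD3.
    for jd in (jd2, jd3):
        jd.fence(2)
    # Step 3: NN2 writes its own txid 1 to JD2 and JD3, then crashes.
    for jd in (jd2, jd3):
        jd.write(2, 1, "edit from NN2")
    return [jd1, jd2, jd3]


def pick_by_writer_seq(journals, txid):
    """Assumed tie-break: prefer the copy written by the newest NN."""
    copies = [jd.log[txid] for jd in journals if txid in jd.log]
    return max(copies)[1]  # tuples compare by writer_seq first


journals = run_scenario()
# Step 4: by txid alone, every JD looks identical.
print([max(jd.log) for jd in journals])
# The contents differ, and the writer's sequence number tells them apart.
print({jd.name: jd.log[1] for jd in journals})
print(pick_by_writer_seq(journals, 1))
# A revived NN1 is rejected by the JDs that NN2 fenced.
print(journals[1].write(1, 2, "late edit from NN1"))
```

The last call also illustrates the first point: because JD2 recorded NN2's sequence number during fencing, a late write from NN1 is refused there, while JD1, which NN2 never reached, would still accept it.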
                
> Enable journal protocol based editlog streaming for standby namenode
> --------------------------------------------------------------------
>
>                 Key: HDFS-3092
>                 URL: https://issues.apache.org/jira/browse/HDFS-3092
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha, name-node
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: MultipleSharedJournals.pdf
>
>
> Currently standby namenode relies on reading shared editlogs to stay current 
> with the active namenode, for namespace changes. BackupNode used streaming 
> edits from active namenode for doing the same. This jira is to explore using 
> journal protocol based editlog streams for the standby namenode. A daemon in 
> standby will get the editlogs from the active and write it to local edits. To 
> begin with, the existing standby mechanism of reading from a file, will 
> continue to be used, instead of from shared edits, from the local edits.


        
