[jira] [Updated] (HDFS-1979) HDFS-1073: Fix backupnode for new edits/image layout

Todd Lipcon (JIRA) Wed, 06 Jul 2011 14:53:43 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Todd Lipcon updated HDFS-1979:
------------------------------

    Attachment: hdfs-1979.txt

Cleaned up the patch. I think this should be ready to go. Here's a summary of 
some of the changes to make the rather large patch easier to follow:

BackupNode itself:
- no longer uses the "spool" file. Instead, the state tracks whether the BN is 
"in sync" or "journaling only". In essence, the next log segment is used as the 
spool file.
- lots of refactoring so that the checkpoint code is primarily shared with the
SecondaryNameNode. We could pull this into a new CheckpointUtils class or
something, but I didn't want to make that change here since it would make the
patch even larger.
- moved the BN-specific RPCs into a new BackupNodeProtocol instead of sharing
NameNodeProtocol. This makes sense since the NN itself was just throwing
exceptions on those calls.
- split the BN RPCs into several pieces, rather than using journal() for
everything. This makes the API easier to follow.
- fixed bugs where the NN would send uncheckpointed txns to the BackupNode (BN 
is currently non-functional in trunk)
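
To illustrate the "in sync" vs "journaling only" state described above, here is
a rough sketch. It is not the code from the patch: the class, enum, and method
names are hypothetical, and plain in-memory lists stand in for the on-disk log
segment and the namespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the BackupNode code from the patch.
class BackupStateSketch {
    enum State { JOURNAL_ONLY, IN_SYNC }

    private State state = State.JOURNAL_ONLY;
    // Stands in for the current log segment, which doubles as the spool.
    private final List<String> segment = new ArrayList<>();
    // Stands in for the in-memory namespace.
    private final List<String> namespace = new ArrayList<>();

    // Called for every edit streamed from the NN.
    void journal(String op) {
        segment.add(op);              // edits are always journaled
        if (state == State.IN_SYNC) {
            namespace.add(op);        // and applied only once caught up
        }
    }

    // Replay edits that were journaled but not yet applied, then go in sync.
    void catchUp() {
        for (int i = namespace.size(); i < segment.size(); i++) {
            namespace.add(segment.get(i));
        }
        state = State.IN_SYNC;
    }

    State state() { return state; }
    int journaledCount() { return segment.size(); }
    int appliedCount() { return namespace.size(); }

    public static void main(String[] args) {
        BackupStateSketch bn = new BackupStateSketch();
        bn.journal("mkdir /a");
        bn.journal("mkdir /b");
        System.out.println(bn.state() + " applied=" + bn.appliedCount());
        bn.catchUp();
        bn.journal("mkdir /c");
        System.out.println(bn.state() + " applied=" + bn.appliedCount());
    }
}
```

The point of the sketch is that no separate spool file is needed: edits that
arrive while the BN is behind are already sitting in the log segment and can be
replayed from there.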

EditLog:
- added new BackupJournalManager to coordinate talking to BN
- added a new parameter to the start/end log segment calls that controls
whether to write the special START/END transactions. This was necessary since
the BN will receive these replicated from the NN, and thus shouldn't add its
own in addition to what the NN wrote.
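
A minimal sketch of why the flag matters, under the assumption that the BN
replays exactly the ops the NN sends it. The class, method names, and op labels
below are placeholders, not the signatures used in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and op labels are placeholders.
class SegmentSketch {
    final List<String> ops = new ArrayList<>();
    private boolean open = false;

    void startLogSegment(boolean writeStartTxn) {
        if (open) {
            throw new IllegalStateException("segment already open");
        }
        open = true;
        if (writeStartTxn) {
            ops.add("START");   // marker written by the log's owner
        }
    }

    void logEdit(String op) {
        if (!open) {
            throw new IllegalStateException("no open segment");
        }
        ops.add(op);
    }

    void endLogSegment(boolean writeEndTxn) {
        if (!open) {
            throw new IllegalStateException("no open segment");
        }
        if (writeEndTxn) {
            ops.add("END");
        }
        open = false;
    }

    public static void main(String[] args) {
        // The NN writes its own markers.
        SegmentSketch nn = new SegmentSketch();
        nn.startLogSegment(true);
        nn.logEdit("mkdir /a");
        nn.endLogSegment(true);

        // The BN receives the markers as replicated edits, so it opens and
        // closes its segment without writing markers of its own.
        SegmentSketch bn = new SegmentSketch();
        bn.startLogSegment(false);
        for (String op : nn.ops) {
            bn.logEdit(op);
        }
        bn.endLogSegment(false);

        System.out.println("logs identical: " + nn.ops.equals(bn.ops));
    }
}
```

If the BN passed true here, its segment would contain two START and two END
entries and would no longer match the NN's log.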

BackupImage/FSImage:
- new concept of "lastAppliedTxId", which tracks the latest txid reflected by
the namesystem. Some refactoring was done so that this is properly tracked
during image loading, etc. We used to simply use the edit log's "last written
txid" for this, but in the case of the BN the edit log may be written ahead of
what the NS actually reflects.
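
The distinction can be sketched as two independent counters. Only the name
lastAppliedTxId comes from the patch; the class and the other names below are
hypothetical, and the sketch assumes txids are applied strictly in order.

```java
// Hypothetical sketch of tracking "applied" separately from "written".
class TxIdTrackerSketch {
    private long lastWrittenTxId = 0;  // highest txid written to the edit log
    private long lastAppliedTxId = 0;  // highest txid reflected in the namespace

    // Loading an image brings the namespace up to the image's txid.
    void loadImage(long imageTxId) {
        lastAppliedTxId = imageTxId;
    }

    // Writing to the log does not change what the namespace reflects.
    void logEdit(long txId) {
        lastWrittenTxId = txId;
    }

    // Applying an edit advances the namespace by exactly one transaction.
    void applyEdit(long txId) {
        if (txId != lastAppliedTxId + 1) {
            throw new IllegalStateException(
                "expected txid " + (lastAppliedTxId + 1) + " but got " + txId);
        }
        lastAppliedTxId = txId;
    }

    // A checkpoint can only claim what the namespace actually contains.
    long checkpointTxId() { return lastAppliedTxId; }

    long lastWrittenTxId() { return lastWrittenTxId; }

    public static void main(String[] args) {
        TxIdTrackerSketch t = new TxIdTrackerSketch();
        t.loadImage(10);
        t.logEdit(11);
        t.logEdit(12);   // journaled but not yet applied
        t.applyEdit(11);
        System.out.println("written=" + t.lastWrittenTxId()
            + " checkpoint=" + t.checkpointTxId());
    }
}
```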

Storage inspector:
- refactored out the planning of which logs to load on top of which image.
This will probably get changed again by the work in HDFS-1579, but this was
the minimal change to get this working. The planning is used when the BN is
synchronizing with the NN.
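
One way such planning could look, as a sketch only: pick the newest image, then
take the contiguous log segments that start right after it. The types and
names below are hypothetical, and the sketch assumes images always fall on
segment boundaries; it is not the inspector code from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the storage inspector from the patch.
class LoadPlanSketch {
    static final class Segment {
        final long first, last;  // inclusive txid range covered by the segment
        Segment(long first, long last) { this.first = first; this.last = last; }
    }

    static final class Plan {
        final long imageTxId;
        final List<Segment> segments;
        Plan(long imageTxId, List<Segment> segments) {
            this.imageTxId = imageTxId;
            this.segments = segments;
        }
    }

    static Plan plan(List<Long> imageTxIds, List<Segment> segments) {
        if (imageTxIds.isEmpty()) {
            throw new IllegalArgumentException("no image to load");
        }
        long image = Collections.max(imageTxIds);  // newest image wins
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong(s -> s.first));

        List<Segment> toLoad = new ArrayList<>();
        long next = image + 1;  // first txid the image does not cover
        for (Segment s : sorted) {
            if (s.last < next) {
                continue;       // already reflected in the image
            }
            if (s.first != next) {
                throw new IllegalStateException(
                    "expected a segment starting at txid " + next);
            }
            toLoad.add(s);
            next = s.last + 1;
        }
        return new Plan(image, toLoad);
    }

    public static void main(String[] args) {
        List<Long> images = new ArrayList<>();
        images.add(5L);
        images.add(10L);
        List<Segment> logs = new ArrayList<>();
        logs.add(new Segment(1, 5));
        logs.add(new Segment(6, 10));
        logs.add(new Segment(11, 15));
        Plan p = plan(images, logs);
        System.out.println("image=" + p.imageTxId
            + " segments=" + p.segments.size());
    }
}
```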

Tests:
- added new test for the BN that makes sure it can stay in sync with the NN, 
replicates edits identically, etc.
- split the CN and BN tests into separate methods so it is easier to run just one
- removed testBackupRegistration since we no longer have to enforce 
only-one-backupnode

> HDFS-1073: Fix backupnode for new edits/image layout
> ----------------------------------------------------
>
>                 Key: HDFS-1979
>                 URL: https://issues.apache.org/jira/browse/HDFS-1979
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1979-prelim.txt, hdfs-1979.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
