[ https://issues.apache.org/jira/browse/HDFS-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HDFS-1979:
------------------------------
Attachment: hdfs-1979.txt
Cleaned up the patch. I think this should be ready to go. Here's a summary of
some of the changes to make the rather large patch easier to follow:
BackupNode itself:
- no longer uses the "spool" file. Instead, the state tracks whether the BN is
"in sync" or "journaling only". In essence, the next log segment is used as the
spool file.
- lots of refactoring so that checkpoint code is primarily shared with the
SecondaryNameNode. We could pull this into a new CheckpointUtils class or
something, but I didn't want to make that change here since it would make the
patch even larger.
- moved the BN-specific RPCs into a new BackupNodeProtocol instead of sharing
NameNodeProtocol. This makes sense since the NN was just throwing exceptions
on those calls anyway.
- split the BN RPCs into several pieces, rather than using journal() for
everything. This makes the API easier to follow.
- fixed bugs where the NN would send uncheckpointed txns to the BackupNode (BN
is currently non-functional in trunk)
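As a rough illustration of the "in sync" versus "journaling only" idea above,
here is a minimal sketch. It is not code from the patch: the class, enum, and
method names are hypothetical, and a list of txids stands in for the on-disk
log segment. While journaling only, edits are written to the segment but not
applied; catching up replays the segment and flips the state.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical and are not
// taken from the attached patch.
final class BackupSyncSketch {
    enum SyncState { JOURNAL_ONLY, IN_SYNC }

    private SyncState state = SyncState.JOURNAL_ONLY;
    // Stand-in for the current on-disk log segment: every received txid lands here.
    private final List<Long> currentSegment = new ArrayList<>();
    // Highest txid reflected in the in-memory namespace.
    private long appliedTxId;

    BackupSyncSketch(long checkpointTxId) {
        this.appliedTxId = checkpointTxId;
    }

    // Called for each edit streamed from the NN.
    void journal(long txId) {
        currentSegment.add(txId);        // always written to the log segment
        if (state == SyncState.IN_SYNC) {
            apply(txId);                 // applied immediately once caught up
        }
    }

    // Replay whatever the segment holds beyond the namespace, then go IN_SYNC.
    void catchUp() {
        for (long txId : currentSegment) {
            if (txId > appliedTxId) {
                apply(txId);
            }
        }
        state = SyncState.IN_SYNC;
    }

    private void apply(long txId) {
        if (txId != appliedTxId + 1) {
            throw new IllegalStateException(
                "expected txid " + (appliedTxId + 1) + " but got " + txId);
        }
        appliedTxId = txId;
    }

    SyncState getState() { return state; }
    long getAppliedTxId() { return appliedTxId; }
}
```

In this sketch the log segment plays the role the separate spool file used to
play: nothing extra is buffered, the segment itself is what gets replayed.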
EditLog:
- added new BackupJournalManager to coordinate talking to BN
- added a new parameter to start/end log segment that controls whether to
include the special START/END transactions. This was necessary since the BN
will receive these replicated from the NN, and thus shouldn't add its own in
addition to what the NN wrote.
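To make the START/END point concrete, here is a small sketch under stated
assumptions. The class and method names are hypothetical, and records are
plain "txid op" strings rather than real edit log ops. It shows that a
receiver which suppresses its own markers ends up with a byte-for-byte copy of
the sender's segment, while one that writes its own gets duplicates.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names and record format are hypothetical.
final class SegmentMarkerSketch {
    private final List<String> records = new ArrayList<>();
    private long nextTxId;

    SegmentMarkerSketch(long firstTxId) {
        this.nextTxId = firstTxId;
    }

    // The flag decides whether this node writes its own START marker.
    void startLogSegment(boolean writeStartTxn) {
        if (writeStartTxn) {
            logEdit("START");
        }
    }

    // The flag decides whether this node writes its own END marker.
    void endLogSegment(boolean writeEndTxn) {
        if (writeEndTxn) {
            logEdit("END");
        }
    }

    // Log a locally generated op, assigning it the next txid.
    void logEdit(String op) {
        records.add(nextTxId + " " + op);
        nextTxId++;
    }

    // Append a record received from another node exactly as it was written there.
    void journalReplicated(String record) {
        records.add(record);
        nextTxId++;
    }

    List<String> getRecords() {
        return new ArrayList<>(records);
    }

    // Replay the sender's segment onto a fresh receiver-side log.
    static SegmentMarkerSketch replicate(SegmentMarkerSketch sender,
                                         long firstTxId,
                                         boolean receiverWritesMarkers) {
        SegmentMarkerSketch receiver = new SegmentMarkerSketch(firstTxId);
        receiver.startLogSegment(receiverWritesMarkers);
        for (String record : sender.getRecords()) {
            receiver.journalReplicated(record);
        }
        receiver.endLogSegment(receiverWritesMarkers);
        return receiver;
    }
}
```

Calling replicate(nn, 1, false) yields the same records as the sender, which
is the behavior the new parameter is meant to allow on the BN side.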
BackupImage/FSImage:
- new concept of "lastAppliedTxId", which tracks the latest txid reflected by
the namesystem. Some refactoring was done so that this is properly tracked
during image loading, etc. We used to simply use the edit log's "last written
txid" for this, but in the case of the BN the edit log may be writing ahead of
what the NS actually reflects.
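The distinction between the two counters can be sketched as follows. This is
a hypothetical illustration, not the FSImage code: only the name
lastAppliedTxId comes from the description above, and everything else is
assumed for the example.

```java
// Illustrative sketch only: apart from the lastAppliedTxId name, the class
// and its methods are hypothetical.
final class TxIdTrackingSketch {
    // Advanced whenever an edit is written to the edit log.
    private long lastWrittenTxId;
    // Advanced only when an edit is actually applied to the namespace.
    private long lastAppliedTxId;

    TxIdTrackingSketch(long loadedImageTxId) {
        this.lastWrittenTxId = loadedImageTxId;
        this.lastAppliedTxId = loadedImageTxId;
    }

    void logEdit(long txId) {
        lastWrittenTxId = txId;
    }

    void applyEdit(long txId) {
        if (txId != lastAppliedTxId + 1) {
            throw new IllegalStateException(
                "expected txid " + (lastAppliedTxId + 1) + " but got " + txId);
        }
        lastAppliedTxId = txId;
    }

    // A saved image reflects the namespace, so it is labeled with the applied txid.
    long checkpointTxId() {
        return lastAppliedTxId;
    }

    // Number of edits that are on disk but not yet reflected in the namespace.
    long unappliedCount() {
        return lastWrittenTxId - lastAppliedTxId;
    }
}
```

On a node that applies every edit as it logs it, the two counters move
together; they only diverge when the log is written ahead of the namespace.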
Storage inspector:
- refactored out the planning of which logs to load on top of which image.
This will probably get changed again by the work in HDFS-1579, but this was
the minimal change to get this working. It is used when the BN is
synchronizing with the NN.
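One way to picture such a load plan is sketched below. This is an assumption
about the general shape of the problem, not the inspector code in the patch:
the types and the plan() method are hypothetical, and the sketch assumes every
log segment begins exactly where an image or the previous segment ends.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: all types and methods here are hypothetical.
final class LoadPlanSketch {
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    static final class Plan {
        final long imageTxId;
        final List<Segment> segments;
        Plan(long imageTxId, List<Segment> segments) {
            this.imageTxId = imageTxId;
            this.segments = segments;
        }
    }

    // Pick the newest image, then the contiguous run of segments that follows it.
    static Plan plan(List<Long> imageTxIds, List<Segment> segments) {
        if (imageTxIds.isEmpty()) {
            throw new IllegalArgumentException("no image found");
        }
        long imageTxId = Collections.max(imageTxIds);

        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((Segment s) -> s.firstTxId));

        List<Segment> chosen = new ArrayList<>();
        long expected = imageTxId + 1;
        for (Segment s : sorted) {
            if (s.lastTxId < expected) {
                continue; // already covered by the image or an earlier segment
            }
            if (s.firstTxId != expected) {
                throw new IllegalStateException("gap in edit logs at txid " + expected);
            }
            chosen.add(s);
            expected = s.lastTxId + 1;
        }
        return new Plan(imageTxId, chosen);
    }
}
```

With images at txids 10 and 20 and segments 1-10, 11-20, 21-30, and 31-35, the
sketch picks the image at 20 and the last two segments.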
Tests:
- added new test for the BN that makes sure it can stay in sync with the NN,
replicates edits identically, etc.
- split the CN and BN tests into separate methods so it is easier to run just one
- removed testBackupRegistration since we no longer have to enforce
only-one-backupnode
> HDFS-1073: Fix backupnode for new edits/image layout
> ----------------------------------------------------
>
> Key: HDFS-1979
> URL: https://issues.apache.org/jira/browse/HDFS-1979
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-1979-prelim.txt, hdfs-1979.txt
>
>