GeorgeJahad opened a new pull request, #3980:
URL: https://github.com/apache/ozone/pull/3980

   
   ## What changes were proposed in this pull request?
   
   This PR updates the OM follower bootstrap mechanism to include the snapshot 
state.
   
   It considers snapshot state to be all files under  
*metadataDir/db.snapshots*.
   
   That includes the OM snapshot directories, as well as the snapshot diff 
compaction logs and backup SST files (which this PR moves to the 
"db.snapshots" dir).
   
   This PR adds the contents of the db.snapshots dir to the tarball sent to the 
follower.  To reduce the size of the tarball, it does not include multiple 
copies of any hard-linked file.  Instead, it includes a list of hard links for 
the follower to recreate.
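   
   As a rough illustration of the hard link handling described above (a
   sketch, not the code in this PR): the leader side can group files by
   inode, copy one path per inode, and record the remaining paths as links
   for the follower to recreate. The class and method names here are
   hypothetical, and the sketch assumes a POSIX file system where
   `BasicFileAttributes.fileKey()` is non-null.
   
   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.attribute.BasicFileAttributes;
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   import java.util.TreeMap;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;
   
   // Hypothetical helper, not part of Ozone.
   final class HardLinkPlan {
     // Paths (relative to the root) whose bytes go into the tarball once.
     final List<Path> filesToCopy = new ArrayList<>();
     // link path -> already-copied path with the same inode.
     final Map<Path, Path> hardLinks = new TreeMap<>();
   
     // Leader side: decide what to copy and what to list as a hard link.
     static HardLinkPlan scan(Path root) throws IOException {
       HardLinkPlan plan = new HardLinkPlan();
       Map<Object, Path> firstSeen = new HashMap<>();
       List<Path> files;
       try (Stream<Path> walk = Files.walk(root)) {
         files = walk.filter(Files::isRegularFile)
             .sorted()
             .collect(Collectors.toList());
       }
       for (Path file : files) {
         // On POSIX file systems fileKey() identifies the device and inode.
         // It may be null elsewhere; such files are treated as unique.
         Object key =
             Files.readAttributes(file, BasicFileAttributes.class).fileKey();
         Path rel = root.relativize(file);
         Path first = (key == null) ? null : firstSeen.putIfAbsent(key, rel);
         if (first == null) {
           plan.filesToCopy.add(rel);
         } else {
           plan.hardLinks.put(rel, first);
         }
       }
       return plan;
     }
   
     // Follower side: recreate the listed links after unpacking the tarball.
     static void recreateLinks(Path root, Map<Path, Path> links)
         throws IOException {
       for (Map.Entry<Path, Path> e : links.entrySet()) {
         Path link = root.resolve(e.getKey());
         Files.createDirectories(link.getParent());
         Files.createLink(link, root.resolve(e.getValue()));
       }
     }
   }
   ```
   
   In this sketch the link list is an in-memory map; how the real list is
   serialized inside the tarball is not shown here.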
   
   Design doc here: 
https://docs.google.com/document/d/1cFZj-7NRxiHaZ56ndcf1Z1EqapPFy4fo4dDVIy_aCx4/edit
   
   ### Recon
   Recon also uses the same tarball to initialize its copy of the OM RocksDB.  
Since it doesn't need the snapshot data, I've added the "includeSnapshotData" 
parameter to the HTTP request so that Recon can ask for a tarball without it.
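   
   A minimal sketch of how a caller might set that parameter. Only the
   parameter name "includeSnapshotData" comes from this PR; the class name,
   the "/dbCheckpoint" path, the true/false values, and treating a missing
   value as false are assumptions made for illustration.
   
   ```java
   import java.net.URI;
   
   // Hypothetical helper, not part of Ozone.
   final class CheckpointRequest {
     static final String INCLUDE_SNAPSHOT_DATA = "includeSnapshotData";
   
     // Client side (e.g. Recon passes false, an OM follower passes true).
     // The endpoint path below is an assumption.
     static URI checkpointUri(String omHttpAddress, boolean includeSnapshots) {
       return URI.create("http://" + omHttpAddress + "/dbCheckpoint?"
           + INCLUDE_SNAPSHOT_DATA + "=" + includeSnapshots);
     }
   
     // Server side: interpret the raw query value. Boolean.parseBoolean
     // returns false for null, so a missing parameter means "exclude" in
     // this sketch; the real default may differ.
     static boolean wantsSnapshotData(String rawValue) {
       return Boolean.parseBoolean(rawValue);
     }
   }
   ```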
   
   ### Renamed OzoneManagerSnapshotProvider
   The Ratis code uses the term "snapshot" to mean something other than what we 
mean.  It uses "snapshot" to refer to the tarball as a whole (which now 
includes all of the individual "OM snapshots").
   
   In particular, this class, in the "om/snapshot" directory, is ambiguously 
named:
   ```
   org/apache/hadoop/ozone/om/snapshot/OzoneManagerSnapshotProvider.java
   ```
   To reduce potential confusion, I've renamed it to:
   ```
   org/apache/hadoop/ozone/om/ratis_snapshot/OmRatisSnapshotProvider.java
   ```
   
   
   ### Internal Consistency of Tarball
   There are two areas of consistency I've thought about:
   
   ##### Snapshot Info Table Entries -> Snapshot Directories
   There needs to be a directory for each snapshot info table entry.  These 
directories sometimes appear a short while after the snapshot info table entry 
is created.
   
   This PR addresses that by ensuring the directories exist before creating the 
tarball (pausing for a few seconds if needed).
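   
   The wait described above could look roughly like the following bounded
   poll. This is a sketch only: the class name, method name, and the timeout
   and poll interval parameters are hypothetical, and the PR's actual
   mechanism may differ.
   
   ```java
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.time.Duration;
   import java.util.List;
   
   // Hypothetical helper, not part of Ozone.
   final class SnapshotDirWaiter {
     // Returns true once every expected snapshot directory exists, or false
     // if the timeout elapses first.
     static boolean waitForDirs(List<Path> expectedDirs, Duration timeout,
         Duration pollInterval) throws InterruptedException {
       long deadline = System.nanoTime() + timeout.toNanos();
       while (true) {
         if (expectedDirs.stream().allMatch(Files::isDirectory)) {
           return true;
         }
         if (System.nanoTime() - deadline >= 0) {
           return false;
         }
         Thread.sleep(pollInterval.toMillis());
       }
     }
   }
   ```
   
   A caller would build `expectedDirs` from the snapshot info table entries
   and decide what to do (fail or retry the checkpoint) when this returns
   false.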
   
   ##### Compaction Logs -> SST Files
   If the tarball is created during compaction, the snap diff compaction logs 
for the most recent compaction may not be included.  I'm not sure how bad a 
problem this is.  Please consider it in your review of this PR.
   
   ### Incremental checkpointing
   The addition of snapshot data will increase the size of the tarball, 
exacerbating the problem described here: 
https://issues.apache.org/jira/browse/HDDS-6510
   
   We'll need to decide whether incremental checkpointing needs to be a part of 
the initial snapshot release.  If not, it may need to come soon afterwards; 
otherwise, users could be stranded.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6961
   
   ## How was this patch tested?
   
   Unit/integration tests added
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

