swamirishi commented on code in PR #9132:
URL: https://github.com/apache/ozone/pull/9132#discussion_r2578887367
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDBCheckpoint.java:
##########
@@ -76,4 +81,13 @@ public void cleanupCheckpoint() throws IOException {
checkpointLocation.toString());
FileUtils.deleteDirectory(checkpointLocation.toFile());
}
+
+ @Override
+ public Path getCheckpointDataDir() throws IOException {
+ Path parent = checkpointLocation.getParent();
+ if (parent == null) {
+ throw new IOException("Checkpoint parent path is null");
+ }
+ return Paths.get(parent.toString(), OM_CHECKPOINT_DATA_DIR);
Review Comment:
This function is not required. Please create a new implementation of
DBCheckpoint maybe InodeMetadataCheckpoint which would be responsible for
holding inodeFiles and creating hardlinks. So downloadSnapshotFromLeader
function would return a InodeMetadaCheckpoint which already has the tarball
extracted and has corresponding directories(relative to the
metadataDirectories) which it has to move.
Using RocksdbCheckpoint doesn't make sense anymore. We can avoid adding
unnecessary functions
Let use remove the createHardlinks static function it doesn't make sense to
have that as a Util function which is so specific to the bootstrap
implementation we would never use it anywhere.
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -4046,36 +4048,43 @@ public synchronized TermIndex
installSnapshotFromLeader(String leaderId) {
try {
// Install hard links.
OmSnapshotUtils.createHardLinks(omDBCheckpoint.getCheckpointLocation(),
false);
- termIndex = installCheckpoint(leaderId, omDBCheckpoint);
+ Path checkpointLocation = omDBCheckpoint.getCheckpointLocation();
+ if (checkpointLocation == null) {
+ throw new IOException("Cannot install checkpoint from leader " +
leaderId + ": checkpointLocation is null");
+ }
+ Path checkpointDataDir = omDBCheckpoint.getCheckpointDataDir();
+ termIndex = installCheckpoint(leaderId, checkpointDataDir);
} catch (Exception ex) {
LOG.error("Failed to install snapshot from Leader OM.", ex);
}
return termIndex;
}
/**
- * Install checkpoint. If the checkpoint snapshot index is greater than
+ * Install checkpoint obtained from leader using the dbCheckpoint endpoint.
+ * The unpacked directory after installing hardlinks would contain a
+ * checkpoint_data dir that comprises both active OM DB dir and the
+ * db.snapshots directory.
+ * If the checkpoint snapshot index is greater than
* OM's last applied transaction index, then re-initialize the OM
* state via this checkpoint. Before re-initializing OM state, the OM Ratis
- * server should be stopped so that no new transactions can be applied.
+ * server should be stopped so that no new transactions can be applied
*/
- TermIndex installCheckpoint(String leaderId, DBCheckpoint omDBCheckpoint)
+ TermIndex installCheckpoint(String leaderId, Path checkpointDataDir)
throws Exception {
-
- Path checkpointLocation = omDBCheckpoint.getCheckpointLocation();
+ Path omDbPath = Paths.get(checkpointDataDir.toString(), OM_DB_NAME);
TransactionInfo checkpointTrxnInfo = OzoneManagerRatisUtils
- .getTrxnInfoFromCheckpoint(configuration, checkpointLocation);
-
+ .getTrxnInfoFromCheckpoint(configuration, omDbPath);
LOG.info("Installing checkpoint with OMTransactionInfo {}",
checkpointTrxnInfo);
-
- return installCheckpoint(leaderId, checkpointLocation, checkpointTrxnInfo);
+ return installCheckpoint(leaderId, checkpointDataDir, checkpointTrxnInfo);
}
- TermIndex installCheckpoint(String leaderId, Path checkpointLocation,
+ TermIndex installCheckpoint(String leaderId, Path checkpointDataDir,
TransactionInfo checkpointTrxnInfo) throws Exception {
long startTime = Time.monotonicNow();
File oldDBLocation = metadataManager.getStore().getDbLocation();
+ Path omDbPath = Paths.get(checkpointDataDir.toString(), OM_DB_NAME);
Review Comment:
Why do we need to worry about individual paths anymore? We have the
checkpoint location we just have to replace the entire omMetadata directory
paths whatever is there in the checkpoint. We don't need to know the contents
of the checkpoint anymore. The logic should be very simple.
Checkpoint has
2 directories om.db, db.snapshots
All we need to do is as and when we replacing these directories we backup
and in case of failure we just move back all the contents in backup directory
back. The logiic would be much simpler to handle and if later the server sends
more directories this code should be robust enough to handle everything as well.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]