hanishakoneru commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438353082
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
##########
@@ -259,16 +261,25 @@ public void start(OzoneConfiguration configuration)
throws IOException {
rocksDBConfiguration.setSyncOption(true);
}
- DBStoreBuilder dbStoreBuilder = DBStoreBuilder.newBuilder(configuration,
- rocksDBConfiguration).setName(OM_DB_NAME)
- .setPath(Paths.get(metaDir.getPath()));
+ this.store = loadDB(configuration, metaDir);
- this.store = addOMTablesAndCodecs(dbStoreBuilder).build();
+ // This value will be used internally, not to be exposed to end users.
Review comment:
We can remove this comment now.
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##########
@@ -3033,32 +3024,47 @@ public TermIndex installSnapshot(String leaderId) {
DBCheckpoint omDBcheckpoint = getDBCheckpointFromLeader(leaderId);
Path newDBlocation = omDBcheckpoint.getCheckpointLocation();
- // Check if current ratis log index is smaller than the downloaded
- // snapshot index. If yes, proceed by stopping the ratis server so that
- // the OM state can be re-initialized. If no, then do not proceed with
- // installSnapshot.
+ LOG.info("Downloaded checkpoint from Leader {}, in to the location {}",
+ leaderId, newDBlocation);
+
long lastAppliedIndex = omRatisServer.getLastAppliedTermIndex().getIndex();
- long checkpointSnapshotIndex = omDBcheckpoint.getRatisSnapshotIndex();
- long checkpointSnapshotTermIndex =
- omDBcheckpoint.getRatisSnapshotTerm();
- if (checkpointSnapshotIndex <= lastAppliedIndex) {
- LOG.error("Failed to install checkpoint from OM leader: {}. The last " +
- "applied index: {} is greater than or equal to the checkpoint's"
- + " " +
- "snapshot index: {}. Deleting the downloaded checkpoint {}",
- leaderId,
- lastAppliedIndex, checkpointSnapshotIndex,
+
+ // Check if current ratis log index is smaller than the downloaded
+ // checkpoint transaction index. If yes, proceed by stopping the ratis
+ // server so that the OM state can be re-initialized. If no, then do not
+ // proceed with installSnapshot.
+
+ OMTransactionInfo omTransactionInfo = null;
+
+ Path dbDir = newDBlocation.getParent();
+ if (dbDir == null) {
+ LOG.error("Incorrect DB location path {} received from checkpoint.",
newDBlocation);
- try {
- FileUtils.deleteFully(newDBlocation);
- } catch (IOException e) {
- LOG.error("Failed to fully delete the downloaded DB checkpoint {} " +
- "from OM leader {}.", newDBlocation,
- leaderId, e);
- }
return null;
}
+ try {
+ omTransactionInfo =
+ OzoneManagerRatisUtils.getTransactionInfoFromDownloadedSnapshot(
+ configuration, dbDir);
+ } catch (Exception ex) {
+ LOG.error("Failed during opening downloaded snapshot from " +
+ "{} to obtain transaction index", newDBlocation, ex);
+ return null;
+ }
+
+ boolean canProceed =
+ OzoneManagerRatisUtils.verifyTransactionInfo(omTransactionInfo,
+ lastAppliedIndex, leaderId, newDBlocation);
+
Review comment:
The lastAppliedIndex could have been updated between its assignment and
the canProceed check. This check should be synchronous. Or at least the
assignment should happen after reading the transactionInfo from DB.
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##########
@@ -3168,8 +3172,8 @@ File replaceOMDBWithCheckpoint(long lastAppliedIndex,
Path checkpointPath)
* All the classes which use/ store MetadataManager should also be updated
* with the new MetadataManager instance.
*/
- void reloadOMState(long newSnapshotIndex,
- long newSnapShotTermIndex) throws IOException {
+ void reloadOMState(long newSnapshotIndex, long newSnapShotTermIndex)
+ throws IOException {
Review comment:
NIT: SnapShot -> Snapshot
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OzoneManagerSnapshotProvider.java
##########
@@ -112,16 +112,16 @@ public OzoneManagerSnapshotProvider(ConfigurationSource
conf,
*/
public DBCheckpoint getOzoneManagerDBSnapshot(String leaderOMNodeID)
throws IOException {
- String snapshotFileName = OM_SNAPSHOT_DB + "_" +
System.currentTimeMillis();
- File targetFile = new File(omSnapshotDir, snapshotFileName + ".tar.gz");
+ String snapshotTime = Long.toString(System.currentTimeMillis());
+ String snapshotFileName = Paths.get(omSnapshotDir.getAbsolutePath(),
+ snapshotTime, OM_DB_NAME).toFile().getAbsolutePath();
+ File targetFile = new File(snapshotFileName + ".tar.gz");
Review comment:
Can we revert this now that we don't need this to loadDB.
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##########
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
}
/**
- * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
- * is the log index corresponding to the last applied transaction on the OM
- * State Machine.
+ * Take OM Ratis snapshot is a dummy operation as when double buffer
+ * flushes the lastAppliedIndex is flushed to DB and that is used as
+ * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
@Override
public long takeSnapshot() throws IOException {
- LOG.info("Saving Ratis snapshot on the OM.");
- if (ozoneManager != null) {
- return ozoneManager.saveRatisSnapshot().getIndex();
- }
- return 0;
+ LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+ long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ ozoneManager.getMetadataManager().getStore().flush();
Review comment:
We should update the snapshotInfo also here.
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##########
@@ -515,13 +528,12 @@ private synchronized void
computeAndUpdateLastAppliedIndex(
}
}
- public void updateLastAppliedIndexWithSnaphsotIndex() {
+ public void updateLastAppliedIndexWithSnaphsotIndex() throws IOException {
Review comment:
Can we rename this method to something like loadSnapshotInfo or
loadSnapshotInfoFromDB?
Also I think we should move the logic inside reinitialize() here and call
this function from reinitialize(). It feels offbeat to call reinitialize()
during initialization.
##########
File path:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##########
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
}
/**
- * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
- * is the log index corresponding to the last applied transaction on the OM
- * State Machine.
+ * Take OM Ratis snapshot is a dummy operation as when double buffer
+ * flushes the lastAppliedIndex is flushed to DB and that is used as
+ * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
@Override
public long takeSnapshot() throws IOException {
- LOG.info("Saving Ratis snapshot on the OM.");
- if (ozoneManager != null) {
- return ozoneManager.saveRatisSnapshot().getIndex();
- }
- return 0;
+ LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+ long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ ozoneManager.getMetadataManager().getStore().flush();
+ return lastAppliedIndex;
Review comment:
> We should return the flushed TransactionInfo#logIndex here too.
TransactionIndex returned via getLastAppliedIndex() might not have been flushed
to DB yet. And logs could be purged upto this index. And if OM crashes before
the transactions are flushed to disk by OMDoubleBuffer, there could be data
loss.
DBStore.flush() would not flush all the transactions in OMDoubleBuffer to
disk, right?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]