prashantwason commented on code in PR #8837:
URL: https://github.com/apache/hudi/pull/8837#discussion_r1266720620


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -910,26 +914,25 @@ public void update(HoodieCleanMetadata cleanMetadata, 
String instantTime) {
   public void update(HoodieRestoreMetadata restoreMetadata, String 
instantTime) {
     dataMetaClient.reloadActiveTimeline();
 
-    // Since the restore has completed on the dataset, the latest write 
timeline instant is the one to which the
-    // restore was performed. If this is not present, then the restore was 
performed to the earliest commit. This
-    // could happen in case of bootstrap followed by rollback e.g. 
TestBootstrap#testFullBootstrapWithRegexModeWithUpdatesMOR.
-    Option<HoodieInstant> restoreInstant = 
dataMetaClient.getActiveTimeline().getWriteTimeline().lastInstant();
-    final String restoreToInstantTime;
-    if (restoreInstant.isPresent()) {
-      restoreToInstantTime = restoreInstant.get().getTimestamp();
-    } else {
-      restoreToInstantTime = 
metadataMetaClient.getActiveTimeline().filterCompletedInstants().firstInstant().get().getTimestamp();
+    // Fetch the commit to restore to (savepointed commit time)
+    HoodieInstant restoreInstant = new 
HoodieInstant(HoodieInstant.State.REQUESTED, HoodieTimeline.RESTORE_ACTION, 
instantTime);
+    HoodieInstant requested = 
HoodieTimeline.getRestoreRequestedInstant(restoreInstant);

Review Comment:
   This does not seem any different from the restoreInstant.
   
   Can possibly remove this and the introduced function 
getRestoreRequestedInstant



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -910,26 +914,25 @@ public void update(HoodieCleanMetadata cleanMetadata, 
String instantTime) {
   public void update(HoodieRestoreMetadata restoreMetadata, String 
instantTime) {
     dataMetaClient.reloadActiveTimeline();
 
-    // Since the restore has completed on the dataset, the latest write 
timeline instant is the one to which the
-    // restore was performed. If this is not present, then the restore was 
performed to the earliest commit. This
-    // could happen in case of bootstrap followed by rollback e.g. 
TestBootstrap#testFullBootstrapWithRegexModeWithUpdatesMOR.
-    Option<HoodieInstant> restoreInstant = 
dataMetaClient.getActiveTimeline().getWriteTimeline().lastInstant();
-    final String restoreToInstantTime;
-    if (restoreInstant.isPresent()) {
-      restoreToInstantTime = restoreInstant.get().getTimestamp();
-    } else {
-      restoreToInstantTime = 
metadataMetaClient.getActiveTimeline().filterCompletedInstants().firstInstant().get().getTimestamp();
+    // Fetch the commit to restore to (savepointed commit time)
+    HoodieInstant restoreInstant = new 
HoodieInstant(HoodieInstant.State.REQUESTED, HoodieTimeline.RESTORE_ACTION, 
instantTime);
+    HoodieInstant requested = 
HoodieTimeline.getRestoreRequestedInstant(restoreInstant);
+    HoodieRestorePlan restorePlan = null;
+    try {
+      restorePlan = TimelineMetadataUtils.deserializeAvroMetadata(
+          
dataMetaClient.getActiveTimeline().readRestoreInfoAsBytes(requested).get(), 
HoodieRestorePlan.class);
+    } catch (IOException e) {
+      throw new HoodieIOException("Deserialization of rollback plan failed ", 
e);

Review Comment:
   Good to have the restoreInstant in this log so can debug which plan is 
missing/corrupted.



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -910,26 +914,25 @@ public void update(HoodieCleanMetadata cleanMetadata, 
String instantTime) {
   public void update(HoodieRestoreMetadata restoreMetadata, String 
instantTime) {
     dataMetaClient.reloadActiveTimeline();
 
-    // Since the restore has completed on the dataset, the latest write 
timeline instant is the one to which the
-    // restore was performed. If this is not present, then the restore was 
performed to the earliest commit. This
-    // could happen in case of bootstrap followed by rollback e.g. 
TestBootstrap#testFullBootstrapWithRegexModeWithUpdatesMOR.
-    Option<HoodieInstant> restoreInstant = 
dataMetaClient.getActiveTimeline().getWriteTimeline().lastInstant();
-    final String restoreToInstantTime;
-    if (restoreInstant.isPresent()) {
-      restoreToInstantTime = restoreInstant.get().getTimestamp();
-    } else {
-      restoreToInstantTime = 
metadataMetaClient.getActiveTimeline().filterCompletedInstants().firstInstant().get().getTimestamp();
+    // Fetch the commit to restore to (savepointed commit time)
+    HoodieInstant restoreInstant = new 
HoodieInstant(HoodieInstant.State.REQUESTED, HoodieTimeline.RESTORE_ACTION, 
instantTime);
+    HoodieInstant requested = 
HoodieTimeline.getRestoreRequestedInstant(restoreInstant);
+    HoodieRestorePlan restorePlan = null;
+    try {
+      restorePlan = TimelineMetadataUtils.deserializeAvroMetadata(
+          
dataMetaClient.getActiveTimeline().readRestoreInfoAsBytes(requested).get(), 
HoodieRestorePlan.class);
+    } catch (IOException e) {
+      throw new HoodieIOException("Deserialization of rollback plan failed ", 
e);
     }
-
-    // We cannot restore to before the oldest compaction on MDT as we don't 
have the base files before that time.
-    Option<HoodieInstant> oldestCompaction = 
metadataMetaClient.getCommitTimeline().filterCompletedInstants().firstInstant();
-    if (oldestCompaction.isPresent()) {
-      if (HoodieTimeline.LESSER_THAN_OR_EQUALS.test(restoreToInstantTime, 
oldestCompaction.get().getTimestamp())) {
-        String msg = String.format("Cannot restore MDT to %s because it is 
before the oldest compaction at %s", restoreToInstantTime,
-            oldestCompaction.get().getTimestamp()) + ". Please delete MDT and 
restore again";
-        LOG.error(msg);
-        throw new HoodieMetadataException(msg);
-      }
+    final String restoreToInstantTime = 
restorePlan.getSavepointToRestoreTimestamp();
+
+    // fetch the earliest commit to retain and ensure the base file prior to 
the time to restore is present
+    List<HoodieFileGroup> filesGroups = 
metadata.getMetadataFileSystemView().getAllFileGroups(MetadataPartitionType.FILES.getPartitionPath()).collect(Collectors.toList());
+    boolean canRestore = filesGroups.get(0).getAllFileSlices().map(fileSlice 
-> fileSlice.getBaseInstantTime()).anyMatch(
+        instantTime1 -> HoodieTimeline.compareTimestamps(instantTime1, 
LESSER_THAN_OR_EQUALS, restoreToInstantTime));
+    if (!canRestore) {
+      throw new HoodieMetadataException("Can't restore since there is no base 
file in MDT lesser than the commit to restore to. "

Review Comment:
   Maybe add the restoreToInstantTime to this log for help in debugging.
   



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -910,26 +914,25 @@ public void update(HoodieCleanMetadata cleanMetadata, 
String instantTime) {
   public void update(HoodieRestoreMetadata restoreMetadata, String 
instantTime) {
     dataMetaClient.reloadActiveTimeline();
 
-    // Since the restore has completed on the dataset, the latest write 
timeline instant is the one to which the
-    // restore was performed. If this is not present, then the restore was 
performed to the earliest commit. This
-    // could happen in case of bootstrap followed by rollback e.g. 
TestBootstrap#testFullBootstrapWithRegexModeWithUpdatesMOR.
-    Option<HoodieInstant> restoreInstant = 
dataMetaClient.getActiveTimeline().getWriteTimeline().lastInstant();
-    final String restoreToInstantTime;
-    if (restoreInstant.isPresent()) {
-      restoreToInstantTime = restoreInstant.get().getTimestamp();
-    } else {
-      restoreToInstantTime = 
metadataMetaClient.getActiveTimeline().filterCompletedInstants().firstInstant().get().getTimestamp();
+    // Fetch the commit to restore to (savepointed commit time)
+    HoodieInstant restoreInstant = new 
HoodieInstant(HoodieInstant.State.REQUESTED, HoodieTimeline.RESTORE_ACTION, 
instantTime);
+    HoodieInstant requested = 
HoodieTimeline.getRestoreRequestedInstant(restoreInstant);
+    HoodieRestorePlan restorePlan = null;
+    try {
+      restorePlan = TimelineMetadataUtils.deserializeAvroMetadata(
+          
dataMetaClient.getActiveTimeline().readRestoreInfoAsBytes(requested).get(), 
HoodieRestorePlan.class);
+    } catch (IOException e) {
+      throw new HoodieIOException("Deserialization of rollback plan failed ", 
e);
     }
-
-    // We cannot restore to before the oldest compaction on MDT as we don't 
have the base files before that time.
-    Option<HoodieInstant> oldestCompaction = 
metadataMetaClient.getCommitTimeline().filterCompletedInstants().firstInstant();
-    if (oldestCompaction.isPresent()) {
-      if (HoodieTimeline.LESSER_THAN_OR_EQUALS.test(restoreToInstantTime, 
oldestCompaction.get().getTimestamp())) {
-        String msg = String.format("Cannot restore MDT to %s because it is 
before the oldest compaction at %s", restoreToInstantTime,
-            oldestCompaction.get().getTimestamp()) + ". Please delete MDT and 
restore again";
-        LOG.error(msg);
-        throw new HoodieMetadataException(msg);
-      }
+    final String restoreToInstantTime = 
restorePlan.getSavepointToRestoreTimestamp();
+
+    // fetch the earliest commit to retain and ensure the base file prior to 
the time to restore is present
+    List<HoodieFileGroup> filesGroups = 
metadata.getMetadataFileSystemView().getAllFileGroups(MetadataPartitionType.FILES.getPartitionPath()).collect(Collectors.toList());
+    boolean canRestore = filesGroups.get(0).getAllFileSlices().map(fileSlice 
-> fileSlice.getBaseInstantTime()).anyMatch(

Review Comment:
   Can we implement this check without get(0)?
   
   get(0) assumes that files partition has a single fileGroup. If we ever move 
to more than 1 partitions for file group, it would be hard to remember to 
change this too and this would be a bug. So best to not use get(0) here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to