brishi19791 commented on code in PR #782:
URL: https://github.com/apache/incubator-xtable/pull/782#discussion_r2759997447


##########
xtable-core/src/main/java/org/apache/xtable/delta/DeltaConversionSource.java:
##########
@@ -191,18 +193,58 @@ public CommitsBacklog<Long> getCommitsBacklog(
   }
 
   /*
-   * In Delta Lake, each commit is a self-describing one i.e. it contains list of new files while
-   * also containing list of files that were deleted. So, vacuum has no special effect on the
-   * incremental sync. Hence, existence of commit is the only check required.
+   * Following checks are performed:
+   * 1. Check if a commit exists at or before the provided instant.
+   * 2. Verify that commit files needed for incremental sync are still accessible.
+   *
+   * Delta Lake's VACUUM operation removes old JSON commit files from _delta_log/, which can
+   * break incremental sync even though commits are self-describing. This method attempts to
+   * access the commit chain to ensure files haven't been vacuumed.
    */
   @Override
   public boolean isIncrementalSyncSafeFrom(Instant instant) {
-    DeltaHistoryManager.Commit deltaCommitAtOrBeforeInstant =
-        deltaLog.history().getActiveCommitAtTime(Timestamp.from(instant), true, false, true);
-    // There is a chance earliest commit of the table is returned if the instant is before the
-    // earliest commit of the table, hence the additional check.
-    Instant deltaCommitInstant = Instant.ofEpochMilli(deltaCommitAtOrBeforeInstant.getTimestamp());
-    return deltaCommitInstant.equals(instant) || deltaCommitInstant.isBefore(instant);
+    try {
+      DeltaHistoryManager.Commit deltaCommitAtOrBeforeInstant =
+          deltaLog.history().getActiveCommitAtTime(Timestamp.from(instant), true, false, true);

Review Comment:
   The 3rd argument seems to be "mustBeRecreatable", and it is currently false. If we just set it to true, won't that be enough?
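
   To make the question concrete, here is a minimal, self-contained sketch of how a "must be recreatable" flag could change the outcome of the safety check. `FakeHistory`, `commitAtOrBefore`, and `earliestRecreatableVersion` are hypothetical stand-ins invented for this illustration; they are not the Delta Lake API. The sketch assumes that a commit older than the earliest recreatable version should be treated as unavailable when the flag is true.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class IncrementalSyncSafetySketch {

  /** Hypothetical in-memory commit history; a stand-in, NOT the Delta Lake API. */
  public static final class FakeHistory {
    private final TreeMap<Instant, Long> versionsByTimestamp = new TreeMap<>();
    // Assumption: versions below this one can no longer be reconstructed from the log.
    private final long earliestRecreatableVersion;

    public FakeHistory(long earliestRecreatableVersion) {
      this.earliestRecreatableVersion = earliestRecreatableVersion;
    }

    public FakeHistory addCommit(long version, Instant timestamp) {
      versionsByTimestamp.put(timestamp, version);
      return this;
    }

    /** Returns the version of the latest commit at or before the instant, if any qualifies. */
    public Optional<Long> commitAtOrBefore(Instant instant, boolean mustBeRecreatable) {
      Map.Entry<Instant, Long> entry = versionsByTimestamp.floorEntry(instant);
      if (entry == null) {
        return Optional.empty(); // the instant precedes every known commit
      }
      if (mustBeRecreatable && entry.getValue() < earliestRecreatableVersion) {
        return Optional.empty(); // the commit exists but is assumed not recreatable
      }
      return Optional.of(entry.getValue());
    }
  }

  /** Incremental sync is treated as safe only if a qualifying commit is found. */
  public static boolean isIncrementalSyncSafeFrom(
      FakeHistory history, Instant instant, boolean mustBeRecreatable) {
    return history.commitAtOrBefore(instant, mustBeRecreatable).isPresent();
  }

  public static void main(String[] args) {
    FakeHistory history =
        new FakeHistory(2L)
            .addCommit(0L, Instant.ofEpochMilli(1_000))
            .addCommit(2L, Instant.ofEpochMilli(3_000));
    Instant betweenCommits = Instant.ofEpochMilli(1_500);
    System.out.println(isIncrementalSyncSafeFrom(history, betweenCommits, false));
    System.out.println(isIncrementalSyncSafeFrom(history, betweenCommits, true));
  }
}
```

   In this toy model the same instant passes the check with the flag off and fails with it on. Whether the real flag covers the vacuumed-log case in the same way would still need to be confirmed against the Delta Lake source.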



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
