bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118140213


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @voonhous: thanks for the great explanation. From your comment, it looks 
like there is a difference in the way the Flink integration resolves valid files. 
   
   Regarding your comment: 
   ```Once the rollback completes, a partition might have a bucketId that maps 
to two fileGroups, breaking the 1 bucketId <> 1 fileGroup mapping contract.```
   
   As part of the rollback, shouldn't the underlying file (which was newly 
created by the failed commit being rolled back) get deleted? Also, why is the 
fileId being read if the commit did not finish?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.