hgudladona commented on issue #12298:
URL: https://github.com/apache/hudi/issues/12298#issuecomment-2491217493

   @ad1happy2go This patch will still not solve this problem. If you follow the 
code path, `getLatestFileSlicesBeforeOrOn` filters the file slices using 
`getLatestFileSliceFilteringUncommittedFiles`, which in turn filters using 
`filterUncommittedFiles`. Looking at that function and at `isCompleted`:
   
   ```
     private Stream<FileSlice> filterUncommittedFiles(FileSlice fileSlice, boolean includeEmptyFileSlice) {
       Option<HoodieBaseFile> committedBaseFile =
           fileSlice.getBaseFile().isPresent() && completionTimeQueryView.isCompleted(fileSlice.getBaseInstantTime())
               ? fileSlice.getBaseFile() : Option.empty();
       List<HoodieLogFile> committedLogFiles = fileSlice.getLogFiles()
           .filter(logFile -> completionTimeQueryView.isCompleted(logFile.getDeltaCommitTime()))
           .collect(Collectors.toList());
       if ((fileSlice.getBaseFile().isPresent() && !committedBaseFile.isPresent())
           || committedLogFiles.size() != fileSlice.getLogFiles().count()) {
         LOG.debug("File Slice (" + fileSlice + ") has uncommitted files.");
         // A file is filtered out of the file-slice if the corresponding
         // instant has not completed yet.
         FileSlice transformed = new FileSlice(fileSlice.getPartitionPath(), fileSlice.getBaseInstantTime(), fileSlice.getFileId());
         committedBaseFile.ifPresent(transformed::setBaseFile);
         committedLogFiles.forEach(transformed::addLogFile);
         if (transformed.isEmpty() && !includeEmptyFileSlice) {
           return Stream.of();
         }
         return Stream.of(transformed);
       }
       return Stream.of(fileSlice);
     }

   ...

     public boolean isCompleted(String instantTime) {
       return this.startToCompletionInstantTimeMap.containsKey(instantTime)
           || HoodieTimeline.compareTimestamps(instantTime, LESSER_THAN, this.firstNonSavepointCommit);
     }
   ```
     
   If a file slice's base instant time is less than `firstNonSavepointCommit`, 
it is treated as completed even though that instant is not in the active 
timeline, which is pretty similar to the current behavior. Kindly go through 
the scenario I mentioned one more time and suggest whether this is the right 
patch.
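
To make the concern concrete, here is a minimal, self-contained sketch of the check quoted above. `CompletionCheckSketch` and `markCompleted` are hypothetical names used only for illustration, not Hudi classes or methods, and the sketch assumes instant times are compared as plain strings in lexicographic order. The instant values are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the completion-time view; NOT Hudi's actual class.
class CompletionCheckSketch {
  // Start instant -> completion instant, for the instants the view knows about.
  private final Map<String, String> startToCompletion = new HashMap<>();
  private final String firstNonSavepointCommit;

  CompletionCheckSketch(String firstNonSavepointCommit) {
    this.firstNonSavepointCommit = firstNonSavepointCommit;
  }

  // Hypothetical helper: record an instant as completed.
  void markCompleted(String startInstant, String completionInstant) {
    startToCompletion.put(startInstant, completionInstant);
  }

  // Same shape as the quoted isCompleted: either the instant is known to be
  // completed, or it is older than firstNonSavepointCommit.
  // Assumption: timestamps compare lexicographically as strings.
  boolean isCompleted(String instantTime) {
    return startToCompletion.containsKey(instantTime)
        || instantTime.compareTo(firstNonSavepointCommit) < 0;
  }

  public static void main(String[] args) {
    CompletionCheckSketch view = new CompletionCheckSketch("20241101000000000");
    view.markCompleted("20241102000000000", "20241102000500000");

    // Known and completed.
    System.out.println(view.isCompleted("20241102000000000"));
    // Unknown and newer than firstNonSavepointCommit.
    System.out.println(view.isCompleted("20241103000000000"));
    // Unknown but older than firstNonSavepointCommit: still reported completed.
    System.out.println(view.isCompleted("20241031000000000"));
  }
}
```

The third call is the case in question: the view has never seen the instant, yet the second branch of the check reports it as completed, so `filterUncommittedFiles` would keep any file written at that instant.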

