bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118140213
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
* some log files, that are based off a commit or delta commit.
*/
private boolean isFileSliceCommitted(FileSlice slice) {
- if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+ if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+ return false;
+ }
+
+ if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {
Review Comment:
@voonhous: thanks for the great explanation. From your comment, it looks
like the Flink integration resolves valid files differently.
Regarding your comment:

> Once the rollback completes, a partition might have a bucketId that maps
> to two fileGroups, breaking the 1 bucketId <> 1 fileGroup mapping contract.

As part of the rollback, shouldn't the underlying file (newly created by the
failed commit that is being rolled back) get deleted? Also, why is the
fileId being read if the commit did not finish?
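
To make the contract in question concrete, here is a rough sketch of how a
violation could be detected. It is only an illustration, not Hudi code: the
class `BucketContractSketch` and both helpers are hypothetical, and it assumes
the bucket id is the first 8 characters of the fileId.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; none of these names are Hudi classes or methods.
final class BucketContractSketch {

  // ASSUMPTION: the bucket id is encoded as the first 8 characters of the fileId.
  // Throws StringIndexOutOfBoundsException for shorter ids.
  static String bucketIdOf(String fileId) {
    return fileId.substring(0, 8);
  }

  // Takes the distinct file group ids of one partition and returns every
  // bucket id that maps to more than one of them, with the offending ids.
  static Map<String, List<String>> findViolations(List<String> fileIdsInPartition) {
    Map<String, List<String>> byBucket = new TreeMap<>();
    for (String fileId : fileIdsInPartition) {
      byBucket.computeIfAbsent(bucketIdOf(fileId), k -> new ArrayList<>()).add(fileId);
    }
    // Keep only buckets that break the 1 bucketId <> 1 fileGroup contract.
    byBucket.values().removeIf(group -> group.size() < 2);
    return byBucket;
  }

  public static void main(String[] args) {
    // Made-up file ids: bucket 00000001 has two file groups.
    List<String> fileIds = List.of("00000000-aaaa", "00000001-bbbb", "00000001-cccc");
    System.out.println(findViolations(fileIds));
  }
}
```

If the rollback deletes the file written by the failed commit, a check like
this should come back empty for the partition, which is why I am asking where
the second file group comes from.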