voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118454395
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
* some log files, that are based off a commit or delta commit.
*/
private boolean isFileSliceCommitted(FileSlice slice) {
- if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+ if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+ return false;
+ }
+
+ if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {
Review Comment:
@bvaradar @danny0405 there seems to be a discrepancy between the 2 rollback
strategies below:
1. rollback with marker - will NOT delete log files that were created by the commit to roll back
2. rollback via listing - will delete log files that were created by the commit to roll back
```java
// Log file can be deleted if the commit to rollback is also the commit that created the fileGroup
if (latestLogFileOption.isPresent() && Objects.equals(baseCommitTime, instantToRollback.getTimestamp())) {
  Path fullDeletePath = new Path(partitionPath, latestLogFileOption.get().getFileName());
  return new HoodieRollbackRequest(relativePartitionPath, EMPTY_STRING, EMPTY_STRING,
      Collections.singletonList(fullDeletePath.toString()),
      Collections.emptyMap());
}
```
Perhaps a fix like this would be sufficient to resolve the discrepancy
in how these 2 strategies roll back log files?
Parallel fix:
https://github.com/apache/hudi/compare/master...voonhous:hudi:HUDI-5822_rollback_fix?expand=1
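
To make the condition in the snippet above easier to reason about in isolation, here is a rough, self-contained sketch of just the predicate. The class name, method name, and sample file names are hypothetical and not part of Hudi; instants are modelled as plain strings purely for illustration.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch, not Hudi code: isolates the "can this log file be
// deleted on rollback?" check from the snippet above.
public class RollbackLogFileSketch {

    // Returns true only when a latest log file exists AND the file group's
    // base commit time equals the instant being rolled back, i.e. the commit
    // to roll back is also the commit that created the file group.
    public static boolean canDeleteLatestLogFile(Optional<String> latestLogFileName,
                                                 String baseCommitTime,
                                                 String instantToRollback) {
        return latestLogFileName.isPresent()
            && Objects.equals(baseCommitTime, instantToRollback);
    }

    public static void main(String[] args) {
        // File group created by the commit being rolled back: deletable.
        System.out.println(canDeleteLatestLogFile(Optional.of("sample.log.1"), "002", "002")); // true
        // File group created by an earlier commit: log file is kept.
        System.out.println(canDeleteLatestLogFile(Optional.of("sample.log.1"), "001", "002")); // false
        // No log file present: nothing to delete.
        System.out.println(canDeleteLatestLogFile(Optional.empty(), "002", "002")); // false
    }
}
```

Under this assumption, only the first case would produce a delete request; the other two would fall through to whatever the strategy does otherwise.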