suryaprasanna commented on code in PR #9007:
URL: https://github.com/apache/hudi/pull/9007#discussion_r1244878470
##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineDiffHelper.java:
##########
@@ -74,19 +74,62 @@ public static TimelineDiffResult
getNewInstantsForIncrementalSync(HoodieTimeline
newTimeline.getInstantsAsStream().filter(instant ->
!oldTimelineInstants.contains(instant)).forEach(newInstants::add);
+ // Check for log compaction commits completed or removed.
List<Pair<HoodieInstant, HoodieInstant>> logCompactionInstants =
getPendingLogCompactionTransitions(oldTimeline, newTimeline);
- List<HoodieInstant> finishedOrRemovedLogCompactionInstants =
logCompactionInstants.stream()
+ List<Pair<HoodieInstant, Boolean>>
finishedOrRemovedLogCompactionInstants = logCompactionInstants.stream()
.filter(instantPair -> !instantPair.getKey().isCompleted()
&& (instantPair.getValue() == null ||
instantPair.getValue().isCompleted()))
- .map(Pair::getKey).collect(Collectors.toList());
- return new TimelineDiffResult(newInstants, finishedCompactionInstants,
finishedOrRemovedLogCompactionInstants, true);
+ .map(instantPair -> (instantPair.getValue() == null)
+ ? Pair.of(instantPair.getKey(), false) :
Pair.of(instantPair.getKey(), true))
Review Comment:
Compaction plans are immutable: once a plan is created, new log files are
written to a new file slice even though execution of the plan has not yet
completed. Log compaction and clustering plans, by contrast, are mutable and
can be removed, so the logic differs for them.
Compaction plans are removed only during a restore operation. If a compaction
commit is removed and an incremental sync is then attempted, the incremental
file sync fails, and we are OK with falling back to a full sync in that case.
For the other two action types, incremental file sync should be able to handle
transitions such as inflight to completed or inflight to rollback.
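To make the distinction concrete, here is a minimal, self-contained sketch of
the classification described above. It does not use Hudi classes: the
`TransitionSketch` class, the `Outcome` enum, the `classify` and
`canSyncIncrementally` helpers, and the string action names are all
hypothetical stand-ins, and the new timeline is assumed to be a simple map
from instant time to a "completed" flag.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Hudi.
class TransitionSketch {

  // What happened to an instant that was pending in the old timeline.
  enum Outcome { STILL_PENDING, COMPLETED, REMOVED }

  /**
   * Classifies each instant that was pending in the old timeline.
   * newTimeline maps instant time to "is completed"; a missing key means
   * the instant no longer exists in the new timeline (it was removed).
   */
  static Map<String, Outcome> classify(List<String> oldPending,
                                       Map<String, Boolean> newTimeline) {
    Map<String, Outcome> outcomes = new LinkedHashMap<>();
    for (String instantTime : oldPending) {
      Boolean completed = newTimeline.get(instantTime);
      if (completed == null) {
        outcomes.put(instantTime, Outcome.REMOVED);
      } else if (completed) {
        outcomes.put(instantTime, Outcome.COMPLETED);
      } else {
        outcomes.put(instantTime, Outcome.STILL_PENDING);
      }
    }
    return outcomes;
  }

  /**
   * Assumption drawn from the comment above: a removed compaction plan only
   * happens on restore, so a full sync is acceptable there, while log
   * compaction and clustering transitions should sync incrementally.
   */
  static boolean canSyncIncrementally(String action, Outcome outcome) {
    return !("compaction".equals(action) && outcome == Outcome.REMOVED);
  }

  public static void main(String[] args) {
    Map<String, Boolean> newTimeline = new HashMap<>();
    newTimeline.put("001", true);   // pending -> completed
    newTimeline.put("003", false);  // still pending
    // "002" is absent from the new timeline, i.e. it was removed.
    Map<String, Outcome> outcomes =
        classify(Arrays.asList("001", "002", "003"), newTimeline);
    for (Map.Entry<String, Outcome> e : outcomes.entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue()
          + ", logcompaction incremental: "
          + canSyncIncrementally("logcompaction", e.getValue())
          + ", compaction incremental: "
          + canSyncIncrementally("compaction", e.getValue()));
    }
  }
}
```

In the diff, the same completed-versus-removed distinction is carried by the
`Boolean` in `Pair<HoodieInstant, Boolean>`; the enum here just makes the three
cases explicit.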
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.