majian1998 commented on code in PR #10325:
URL: https://github.com/apache/hudi/pull/10325#discussion_r1426945258
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java:
##########
@@ -342,11 +342,12 @@ private boolean deleteArchivedInstants(List<ActiveAction> activeActions, HoodieE
);
}
if (!completedInstants.isEmpty()) {
- context.foreach(
- completedInstants,
- instant -> activeTimeline.deleteInstantFileIfExists(instant),
- Math.min(completedInstants.size(), config.getArchiveDeleteParallelism())
- );
+ // Due to the concurrency between deleting completed instants and reading data,
+ // there may be hole in the timeline, which can lead to errors when reading data.
+ // Therefore, the concurrency of deleting completed instants is temporarily disabled,
+ // and instants are deleted in ascending order to prevent the occurrence of such holes.
+ completedInstants.stream()
+ .forEach(instant -> activeTimeline.deleteInstantFileIfExists(instant));
}
Review Comment:
Yes, @yihua, you got it! As @danny0405 pointed out, the
`isFileSliceCommitted(FileSlice)` check can run into problems when completed
instants are deleted concurrently, because a hole in the timeline can lead to
errors when reading data. For inflight instants, the deletion order doesn't
matter.
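
To make the hole concrete, here is a minimal sketch in plain Java. It is a simplified model, not Hudi code: the class `TimelineHoleSketch`, both method names, and the rule "an instant looks committed if it is still on the active timeline or is older than the earliest remaining instant" are assumptions made for illustration only. The sketch deletes instants one at a time and records every instant that a reader could misread as uncommitted at some intermediate step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simplified model for illustration; every name here is hypothetical.
class TimelineHoleSketch {

  // Assumed reader-side rule: an instant looks committed if it is still on
  // the active timeline, or if it sorts before the earliest remaining
  // instant (in which case it is presumed to have been archived).
  static boolean looksCommitted(TreeSet<String> activeTimeline, String instant) {
    if (activeTimeline.isEmpty()) {
      // Assumption: with nothing left, everything is presumed archived.
      return true;
    }
    return activeTimeline.contains(instant)
        || instant.compareTo(activeTimeline.first()) < 0;
  }

  // Deletes instants one by one in the given order and returns, in sorted
  // order, the completed instants that looked uncommitted after any step.
  static List<String> instantsSeenAsUncommitted(List<String> completed,
                                                List<String> deletionOrder) {
    TreeSet<String> active = new TreeSet<>(completed);
    TreeSet<String> misread = new TreeSet<>();
    for (String toDelete : deletionOrder) {
      active.remove(toDelete);
      // Simulate a reader checking every completed instant at this moment.
      for (String instant : completed) {
        if (!looksCommitted(active, instant)) {
          misread.add(instant);
        }
      }
    }
    return new ArrayList<>(misread);
  }
}
```

Under this model, deleting in ascending order never exposes a gap, because the deleted instants always form a prefix of the timeline. Any other order, which a parallel delete cannot rule out, can leave an older instant in place while a newer one is already gone, and the newer one then falls into the hole.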