yihua commented on code in PR #10325:
URL: https://github.com/apache/hudi/pull/10325#discussion_r1426822741


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java:
##########
@@ -342,11 +342,12 @@ private boolean deleteArchivedInstants(List<ActiveAction> activeActions, HoodieE
       );
     }
     if (!completedInstants.isEmpty()) {
-      context.foreach(
-          completedInstants,
-          instant -> activeTimeline.deleteInstantFileIfExists(instant),
-          Math.min(completedInstants.size(), config.getArchiveDeleteParallelism())
-      );
+      // Due to the concurrency between deleting completed instants and reading data,
+      // there may be hole in the timeline, which can lead to errors when reading data.
+      // Therefore, the concurrency of deleting completed instants is temporarily disabled,
+      // and instants are deleted in ascending order to prevent the occurrence of such holes.
+      completedInstants.stream()
+          .forEach(instant -> activeTimeline.deleteInstantFileIfExists(instant));
     }

Review Comment:
   A side note: Hudi makes sure that inflight data files are cleaned up before 
archival happens on the relevant instants.  As a result, base files whose instant 
time is earlier than the start of the active timeline are always committed, which 
is what the check logic in `isFileSliceCommitted(FileSlice)` relies on.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

