danny0405 commented on code in PR #14261:
URL: https://github.com/apache/hudi/pull/14261#discussion_r2544184992
##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/versioning/v2/ArchivedTimelineV2.java:
##########
@@ -242,6 +260,47 @@ private List<HoodieInstant> loadInstants(
     return result;
   }
+  /**
+   * Loads instants with a limit on the number of instants to load.
+   * This is used for limit-based loading where we only want to load the N most recent instants.
+   */
+  private void loadInstantsWithLimit(int limit, HoodieArchivedTimeline.LoadMode loadMode,
+      Function<GenericRecord, Boolean> commitsFilter) {
+    InstantsLoaderWithLimit loader = new InstantsLoaderWithLimit(limit, loadMode);
+    timelineLoader.loadInstants(metaClient, null, loadMode, commitsFilter, loader);
+  }
+
+  /**
+   * Callback to read instant details with a limit on the number of instants to load.
+   * Extends BiConsumer to be used as a callback in the timeline loader.
+   * The BiConsumer interface allows it to be passed as a lambda/function that accepts
+   * (instantTime, GenericRecord) pairs during the loading process.
+   */
+  private class InstantsLoaderWithLimit implements BiConsumer<String, GenericRecord> {
+    private final int limit;
+    private final HoodieArchivedTimeline.LoadMode loadMode;
+    private volatile int loadedCount = 0;
+
+    private InstantsLoaderWithLimit(int limit, HoodieArchivedTimeline.LoadMode loadMode) {
+      this.limit = limit;
+      this.loadMode = loadMode;
+    }
+
+    @Override
+    public void accept(String instantTime, GenericRecord record) {
+      if (loadedCount >= limit) {
+        return;
Review Comment:
Even if we return early here, all of the timeline files are still read and
decoded; we should stop reading as soon as the limit is reached.
Another issue: when `limit` is set, the read semantics should be the
`latest N` instants, but `timelineLoader.loadInstants(` does not
guarantee this.
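
To illustrate both points, here is a minimal, self-contained sketch. It does not use the Hudi API: `LimitedLoadSketch`, `loadLatest`, and `Result` are hypothetical names, and the archived timeline is modeled as plain in-memory lists of instant times, under the assumption that files are ordered oldest to newest and that instants within each file are sorted ascending. The idea is to walk the files newest-first and stop opening files once `limit` instants have been collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the Hudi implementation.
class LimitedLoadSketch {

  /** Holds the collected instants and the number of files that were opened. */
  public static final class Result {
    public final List<String> instants;
    public final int filesRead;

    public Result(List<String> instants, int filesRead) {
      this.instants = instants;
      this.filesRead = filesRead;
    }
  }

  /**
   * Collects the latest {@code limit} instants, newest first.
   * Assumes files are ordered oldest to newest and each file is sorted ascending.
   */
  public static Result loadLatest(List<List<String>> filesOldestFirst, int limit) {
    List<String> latest = new ArrayList<>();
    int filesRead = 0;
    // Walk files from newest to oldest; the loop condition stops the scan
    // as soon as the limit is reached, so older files are never opened.
    for (int f = filesOldestFirst.size() - 1; f >= 0 && latest.size() < limit; f--) {
      filesRead++;
      List<String> file = filesOldestFirst.get(f);
      // Within a file, take the newest instants first.
      for (int i = file.size() - 1; i >= 0 && latest.size() < limit; i--) {
        latest.add(file.get(i));
      }
    }
    return new Result(latest, filesRead);
  }

  public static void main(String[] args) {
    List<List<String>> files = Arrays.asList(
        Arrays.asList("001", "002", "003"),
        Arrays.asList("004", "005"),
        Arrays.asList("006", "007"));
    Result r = loadLatest(files, 3);
    // prints "[007, 006, 005] from 2 files"
    System.out.println(r.instants + " from " + r.filesRead + " files");
  }
}
```

In this model the oldest file is never touched when `limit` is 3, which is the behavior the callback-only early return cannot provide. How the real loader would expose file ordering and early termination is left open here.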