zhangyue19921010 commented on a change in pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#discussion_r777901212
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTimelineArchiveLog.java
##########
@@ -134,12 +161,199 @@ public boolean archiveIfRequired(HoodieEngineContext context) throws IOException
LOG.info("No Instants to archive");
}
+ if (config.getArchiveAutoMergeEnable()) {
+ mergeArchiveFilesIfNecessary(context);
+ }
return success;
} finally {
close();
}
}
+ private void mergeArchiveFilesIfNecessary(HoodieEngineContext context) throws IOException {
+ Path planPath = new Path(metaClient.getArchivePath(), mergeArchivePlanName);
+ // Flush reminded content if existed and open a new write
+ reOpenWriter();
+ // List all archive files
+ FileStatus[] fsStatuses = metaClient.getFs().globStatus(
Review comment:
Nice catch. I just found that it may be important to preserve the original
instant order of the small archive files:
https://github.com/apache/hudi/blob/b5f05fd153df29a8be377404a14a0ced2f00b4bf/hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieArchivedTimeline.java#L219
When loading archived instants, Hudi relies on this order to skip reading
archive files it does not need:
https://github.com/apache/hudi/blob/b5f05fd153df29a8be377404a14a0ced2f00b4bf/hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieArchivedTimeline.java#L243
So I just use the same ordering comparator here.
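
To illustrate why the ordering matters, here is a minimal sketch, not the actual Hudi code. It assumes archive file names look like `.commits_.archive.<version>_<write-token>` and that files should be visited by ascending numeric version; the class name `ArchiveOrderSketch`, its helper methods, and the sort direction are all hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the Hudi implementation.
class ArchiveOrderSketch {
  // Assumed naming scheme: ".commits_.archive.<version>_<write-token>".
  private static final Pattern ARCHIVE_FILE_PATTERN =
      Pattern.compile("^\\.commits_\\.archive\\.([0-9]+).*");

  // Extracts the numeric version from an archive file name.
  static int versionOf(String fileName) {
    Matcher m = ARCHIVE_FILE_PATTERN.matcher(fileName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not an archive file name: " + fileName);
    }
    return Integer.parseInt(m.group(1));
  }

  // Returns a copy sorted by numeric version, lowest first (assumed direction).
  static List<String> sortByVersion(List<String> fileNames) {
    List<String> sorted = new ArrayList<>(fileNames);
    sorted.sort(Comparator.comparingInt(ArchiveOrderSketch::versionOf));
    return sorted;
  }
}
```

The point of comparing the parsed version rather than the raw name is that a plain string sort would place version 10 before version 2, which would break any logic that depends on reading the files in instant order.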
What do you think? :)