prashantwason commented on a change in pull request #2064:
URL: https://github.com/apache/hudi/pull/2064#discussion_r491329825
##########
File path:
hudi-client/src/main/java/org/apache/hudi/table/HoodieTimelineArchiveLog.java
##########
@@ -194,6 +196,35 @@ public boolean archiveIfRequired(JavaSparkContext jsc) throws IOException {
.collect(Collectors.groupingBy(i -> Pair.of(i.getTimestamp(),
HoodieInstant.getComparableAction(i.getAction()))));
+ // If metadata table is enabled, do not archive instants which are more recent than the latest compaction
+ // of the metadata table. This is required for metadata table sync.
+ if (config.useFileListingMetadata()) {
+ Option<String> latestCompaction = HoodieMetadata.getLatestCompactionTimestamp(config.getBasePath());
+ if (latestCompaction.isPresent()) {
+ LOG.info("Limiting archiving of instants to last compaction on metadata table at " + latestCompaction.get());
+ instants = instants.filter(i -> HoodieTimeline.compareTimestamps(i.getTimestamp(), HoodieTimeline.LESSER_THAN,
+ latestCompaction.get()));
+ } else {
+ LOG.info("Not archiving instants as there is no compaction yet of the metadata table");
+ instants = instants.filter(i -> false);
+ }
+ }
+
+ // For metadata tables, ensure commits >= latest compaction commit are retained. This is required for
+ // metadata table sync.
+ if (HoodieMetadata.isMetadataTable(config.getBasePath())) {
+ Option<HoodieInstant> latestCompactionInstant =
+ table.getActiveTimeline().filterPendingCompactionTimeline().lastInstant();
+ if (latestCompactionInstant.isPresent()) {
+ LOG.info("Limiting archiving of instants on metadata table to last compaction at " + latestCompactionInstant.get());
+ instants = instants.filter(i -> HoodieTimeline.compareTimestamps(i.getTimestamp(), HoodieTimeline.LESSER_THAN,
+ latestCompactionInstant.get().getTimestamp()));
+ } else {
+ LOG.info("Not archiving instants on metadata table as there is no compaction yet");
+ instants = instants.filter(i -> false);
Review comment:
Done
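
For readers following the thread, here is a minimal standalone sketch of the filtering rule in the hunk above. It is not the Hudi implementation: `ArchiveFilterSketch` and `archivable` are hypothetical names, instants are reduced to plain timestamp strings, and it assumes timestamps are fixed-width so that lexicographic order matches chronological order. Only instants strictly older than the latest compaction are eligible for archiving, and nothing is eligible when no compaction exists yet.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class ArchiveFilterSketch {

    // Hypothetical helper (not part of Hudi): returns the timestamps that may be archived.
    // Assumes fixed-width timestamp strings, so String.compareTo orders them chronologically.
    static List<String> archivable(List<String> instantTimestamps, Optional<String> latestCompaction) {
        if (!latestCompaction.isPresent()) {
            // No compaction yet: archive nothing, mirroring `instants.filter(i -> false)`.
            return Collections.emptyList();
        }
        String cutoff = latestCompaction.get();
        return instantTimestamps.stream()
            // Keep only instants strictly older than the compaction (the LESSER_THAN case).
            .filter(ts -> ts.compareTo(cutoff) < 0)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> instants = Arrays.asList("20200918101500", "20200918113000", "20200918124500");
        // With a compaction at the second instant, only the first instant is archivable.
        System.out.println(archivable(instants, Optional.of("20200918113000")));
        // Without any compaction, the result is empty.
        System.out.println(archivable(instants, Optional.empty()));
    }
}
```

The comparison is strict, so the instant at the compaction timestamp itself is retained along with everything newer, which matches the "commits >= latest compaction commit are retained" comment in the diff.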