yihua commented on a change in pull request #4821:
URL: https://github.com/apache/hudi/pull/4821#discussion_r832536233



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java
##########
@@ -429,25 +428,21 @@ public void mergeArchiveFiles(List<FileStatus> compactCandidate) throws IOExcept
         .collect(Collectors.groupingBy(i -> Pair.of(i.getTimestamp(),
             HoodieInstant.getComparableAction(i.getAction()))));
 
-    // If metadata table is enabled, do not archive instants which are more recent than the last compaction on the
-    // metadata table.
-    if (config.isMetadataTableEnabled()) {
-      try (HoodieTableMetadata tableMetadata = HoodieTableMetadata.create(table.getContext(), config.getMetadataConfig(),
-          config.getBasePath(), FileSystemViewStorageConfig.SPILLABLE_DIR.defaultValue())) {
-        Option<String> latestCompactionTime = tableMetadata.getLatestCompactionTime();
-        if (!latestCompactionTime.isPresent()) {
-          LOG.info("Not archiving as there is no compaction yet on the metadata table");
-          instants = Stream.empty();
-        } else {
-          LOG.info("Limiting archiving of instants to latest compaction on metadata table at " + latestCompactionTime.get());
-          instants = instants.filter(instant -> HoodieTimeline.compareTimestamps(instant.getTimestamp(), HoodieTimeline.LESSER_THAN,
-              latestCompactionTime.get()));
-        }
-      } catch (Exception e) {
-        throw new HoodieException("Error limiting instant archival based on metadata table", e);
+    // If this is a metadata table, do not archive the commits that live in data set
+    // active timeline. This is required by metadata table,
+    // see HoodieTableMetadataUtil#processRollbackMetadata for details.
+    if (HoodieTableMetadata.isMetadataTable(config.getBasePath())) {

Review comment:
   @danny0405 Let's add this new logic on top of the existing
metadata-table-specific logic, i.e., the check for the last compaction on the
metadata table, and land the fix soon without changing the existing logic.
   
   I understand you have concerns about whether we need the compaction check.
We can take that to a separate PR for discussion. The goal here is to land this
fix soon so we can do another round of testing on the metadata table. My worry
is that the check for the last compaction on the metadata table is still needed
in some cases, and if we remove it, we may introduce new problems right before
the release cut. So for safety, let's keep it for now. WDYT?
   
   If you're busy, I can take this up, revise the PR, and land it.
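
   To make the proposal concrete, here is a minimal sketch of how the two
checks could be layered. It is not the actual Hudi code: instants are modeled
as bare timestamp strings, and the class `ArchivalFilterSketch`, the method
`selectArchivable`, and all of its parameters are hypothetical names made up
for illustration. It also assumes that "commits that live in data set active
timeline" means instants at or after the earliest active data set instant.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical, simplified model: an instant is just its timestamp string,
// and timestamps are assumed to sort lexicographically.
class ArchivalFilterSketch {

  static List<String> selectArchivable(
      List<String> candidates,
      boolean metadataTableEnabled,
      Optional<String> latestMetadataCompaction,
      boolean isMetadataTable,
      Optional<String> earliestDataSetActiveInstant) {
    Stream<String> instants = candidates.stream();

    // Existing check (kept as-is): with the metadata table enabled, only
    // archive instants older than its latest compaction; archive nothing
    // if no compaction has happened yet.
    if (metadataTableEnabled) {
      if (!latestMetadataCompaction.isPresent()) {
        instants = Stream.empty();
      } else {
        String limit = latestMetadataCompaction.get();
        instants = instants.filter(ts -> ts.compareTo(limit) < 0);
      }
    }

    // New check (layered on top): for a metadata table, keep every instant
    // that is still in the data set's active timeline (assumption: anything
    // at or after the earliest active data set instant).
    if (isMetadataTable && earliestDataSetActiveInstant.isPresent()) {
      String limit = earliestDataSetActiveInstant.get();
      instants = instants.filter(ts -> ts.compareTo(limit) < 0);
    }

    return instants.collect(Collectors.toList());
  }
}
```

   Because each check only narrows the stream, adding the new one cannot
archive anything the existing compaction check would have kept, which is why
keeping both looks like the safer option before the release cut.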





