nsivabalan commented on a change in pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#discussion_r768825023



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
##########
@@ -249,6 +249,18 @@
           + "record size estimate compute dynamically based on commit metadata. "
           + " This is critical in computing the insert parallelism and bin-packing inserts into small files.");
 
+  public static final ConfigProperty<String> ARCHIVE_FILES_TO_KEEP_PROP = ConfigProperty
+      .key("hoodie.keep.archive.files")
+      .defaultValue("10")
+      .withDocumentation("The numbers of kept archive files under archived");
+
+  public static final ConfigProperty<String> CLEAN_ARCHIVE_FILE_ENABLE_DROP = ConfigProperty
+      .key("hoodie.archive.clean.enable")

Review comment:
       This could be confused with the regular clean. Can we call it
"hoodie.auto.trim.archive.files" or "hoodie.auto.delete.archive.files", or
something along those lines, and use "hoodie.max.archive.files" for the
previous config?

##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiveLog.java
##########
@@ -183,6 +217,24 @@ public void testArchiveTableWithArchival(boolean enableMetadata) throws Exceptio
     }
   }
 
+  @ParameterizedTest
+  @MethodSource("testArchiveTableWithArchivalCleanUp")
+  public void testArchiveTableWithArchivalCleanUp(boolean enableMetadata, boolean enableArchiveClean, int archiveFilesToKeep) throws Exception {
+    HoodieWriteConfig writeConfig = initTestTableAndGetWriteConfig(enableMetadata, 2, 3, 2, enableArchiveClean, archiveFilesToKeep);
+    for (int i = 1; i < 10; i++) {
+      testTable.doWriteOperation("0000000" + i, WriteOperationType.UPSERT, i == 1 ? Arrays.asList("p1", "p2") : Collections.emptyList(), Arrays.asList("p1", "p2"), 2);
+      // trigger archival
+      archiveAndGetCommitsList(writeConfig);
+    }
+    String archivePath = metaClient.getArchivePath();
+    RemoteIterator<LocatedFileStatus> iter = metaClient.getFs().listFiles(new Path(archivePath), false);
+    ArrayList<LocatedFileStatus> files = new ArrayList<>();
+    while (iter.hasNext()) {
+      files.add(iter.next());
+    }
+    assertEquals(archiveFilesToKeep, files.size());

Review comment:
       Can we also add an assertion that the earliest files in the archive
folder are the ones deleted, and not the latest ones?
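
One possible shape for such an assertion, as a minimal sketch only. It
assumes the test records the archive folder listing before trimming and that
file modification time reflects archive order. ArchiveTrimCheck, FileInfo and
keptNewest are hypothetical names for illustration and are not part of Hudi
or of this PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hudi: checks that trimming kept the newest files.
class ArchiveTrimCheck {

  // Hypothetical stand-in for a file listing entry (name plus modification time).
  static final class FileInfo {
    final String name;
    final long modificationTime;

    FileInfo(String name, long modificationTime) {
      this.name = name;
      this.modificationTime = modificationTime;
    }
  }

  // Returns true if `after` contains exactly the `keep` newest entries of `before`.
  // Assumption: modification time orders archive files from oldest to newest.
  static boolean keptNewest(List<FileInfo> before, List<String> after, int keep) {
    List<FileInfo> sorted = new ArrayList<>(before);
    // Newest first.
    sorted.sort(Comparator.comparingLong((FileInfo f) -> f.modificationTime).reversed());
    Set<String> expected = new HashSet<>();
    for (FileInfo f : sorted.subList(0, Math.min(keep, sorted.size()))) {
      expected.add(f.name);
    }
    return expected.equals(new HashSet<>(after));
  }
}
```

In the test above, this could be wrapped in an assertTrue over the names
collected from the archive path, alongside the existing size check.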



