Re: [PR] [HUDI-8449] Fix deletion of record from FILES partition on empty files list [hudi]

via GitHub Mon, 26 May 2025 08:51:44 -0700


vamsikarnika commented on code in PR #13346:
URL: https://github.com/apache/hudi/pull/13346#discussion_r2107592955



##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedTableMetadata.java:
##########
@@ -411,6 +411,61 @@ public void testRepeatedCleanActionsWithMetadataTableEnabled(final HoodieTableTy
     assertEquals(tableVersion, finalTableVersion.versionCode());
   }
 
+  @ParameterizedTest
+  @CsvSource({"COPY_ON_WRITE,6", "COPY_ON_WRITE,8", "MERGE_ON_READ,6", "MERGE_ON_READ,8"})
+  void testDeletePartitionKeyOnCleanPartition(final HoodieTableType tableType, int tableVersion) throws Exception {
+    initPath();
+    writeConfig = getWriteConfigBuilder(true, true, false)
+        .withMetadataConfig(HoodieMetadataConfig.newBuilder()
+            .enable(true)
+            .withMaxNumDeltaCommitsBeforeCompaction(6)
+            .build())
+        .build();
+    writeConfig.setValue(HoodieWriteConfig.WRITE_TABLE_VERSION, String.valueOf(tableVersion));
+    init(tableType, writeConfig);
+    String partition1 = "p1";
+    // Simulate two bulk insert operations adding two data files in partition "p1"
+    String instant1 = metaClient.createNewInstantTime();
+    testTable.doWriteOperation(instant1, BULK_INSERT, emptyList(), Collections.singletonList(partition1), 1);
+    String instant2 = metaClient.createNewInstantTime();
+    testTable.doWriteOperation(instant2, BULK_INSERT, emptyList(), Collections.singletonList(partition1), 1);
+
+    String partition2 = "p2";
+    testTable.doWriteOperation(
+        metaClient.createNewInstantTime(), BULK_INSERT, emptyList(), Collections.singletonList(partition2), 1);
+
+    final HoodieTableMetaClient metadataMetaClient = createMetaClient(metadataTableBasePath);
+    metadataMetaClient.reloadActiveTimeline();
+
+    assertEquals(0, getNumCompactions(metadataMetaClient));
+
+    String cleanInstant = metaClient.createNewInstantTime();
+    HoodieCleanMetadata cleanMetadata = testTable.doCleanBasedOnPartitions(cleanInstant, Collections.singletonList(partition1));
+
+    while (getNumCompactions(metadataMetaClient) == 0) {
+      // Write until the compaction happens in the metadata table

Review Comment:
   During compaction, deleted records are not written to the compacted file, so no entry remains for that key. Without compaction, the `getRecordsByKeyPrefixes` method still reads the deleted record from the log files: it applies pre-combine and marks the record as deleted, but the record is still present.
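
The distinction can be illustrated with a toy model. This is only a sketch and does not use Hudi's classes: `TombstoneSketch`, `LogEntry`, `mergeLogs`, and `compact` are hypothetical names, and an empty `Optional` stands in for a delete marker. Merging the log keeps the deleted key (flagged as deleted), while compaction writes only live records and so drops the key entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class TombstoneSketch {

  // Hypothetical log entry: an empty value stands for a delete marker.
  static final class LogEntry {
    final String key;
    final Optional<String> value;

    LogEntry(String key, Optional<String> value) {
      this.key = key;
      this.value = value;
    }
  }

  // Merge log entries in order; the latest entry per key wins.
  // A delete marker replaces the value, but the key stays in the merged view.
  static Map<String, Optional<String>> mergeLogs(List<LogEntry> log) {
    Map<String, Optional<String>> merged = new LinkedHashMap<>();
    for (LogEntry e : log) {
      merged.put(e.key, e.value);
    }
    return merged;
  }

  // Compaction writes only live records, so deleted keys disappear entirely.
  static Map<String, String> compact(Map<String, Optional<String>> merged) {
    Map<String, String> base = new LinkedHashMap<>();
    merged.forEach((k, v) -> v.ifPresent(val -> base.put(k, val)));
    return base;
  }

  public static void main(String[] args) {
    List<LogEntry> log = new ArrayList<>();
    log.add(new LogEntry("p1", Optional.of("file-1,file-2")));
    log.add(new LogEntry("p2", Optional.of("file-3")));
    log.add(new LogEntry("p1", Optional.empty())); // clean removed every file in p1

    Map<String, Optional<String>> merged = mergeLogs(log);
    // Before compaction the key is still visible, only marked as deleted.
    System.out.println("p1 in merged log view: " + merged.containsKey("p1"));
    // After compaction the key is gone.
    System.out.println("p1 after compaction: " + compact(merged).containsKey("p1"));
  }
}
```

Under this model, an assertion that the key is absent can only pass once compaction has run, which is why the test keeps writing until a compaction appears on the metadata table timeline.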



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
