vamsikarnika commented on code in PR #13346:
URL: https://github.com/apache/hudi/pull/13346#discussion_r2107592955
##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedTableMetadata.java:
##########
@@ -411,6 +411,61 @@ public void testRepeatedCleanActionsWithMetadataTableEnabled(final HoodieTableTy
     assertEquals(tableVersion, finalTableVersion.versionCode());
   }
+  @ParameterizedTest
+  @CsvSource({"COPY_ON_WRITE,6", "COPY_ON_WRITE,8", "MERGE_ON_READ,6", "MERGE_ON_READ,8"})
+  void testDeletePartitionKeyOnCleanPartition(final HoodieTableType tableType, int tableVersion) throws Exception {
+    initPath();
+    writeConfig = getWriteConfigBuilder(true, true, false)
+        .withMetadataConfig(HoodieMetadataConfig.newBuilder()
+            .enable(true)
+            .withMaxNumDeltaCommitsBeforeCompaction(6)
+            .build())
+        .build();
+    writeConfig.setValue(HoodieWriteConfig.WRITE_TABLE_VERSION, String.valueOf(tableVersion));
+    init(tableType, writeConfig);
+    String partition1 = "p1";
+    // Simulate two bulk insert operations adding two data files in partition "p1"
+    String instant1 = metaClient.createNewInstantTime();
+    testTable.doWriteOperation(instant1, BULK_INSERT, emptyList(), Collections.singletonList(partition1), 1);
+    String instant2 = metaClient.createNewInstantTime();
+    testTable.doWriteOperation(instant2, BULK_INSERT, emptyList(), Collections.singletonList(partition1), 1);
+
+    String partition2 = "p2";
+    testTable.doWriteOperation(
+        metaClient.createNewInstantTime(), BULK_INSERT, emptyList(), Collections.singletonList(partition2), 1);
+
+    final HoodieTableMetaClient metadataMetaClient = createMetaClient(metadataTableBasePath);
+    metadataMetaClient.reloadActiveTimeline();
+
+    assertEquals(0, getNumCompactions(metadataMetaClient));
+
+    String cleanInstant = metaClient.createNewInstantTime();
+    HoodieCleanMetadata cleanMetadata = testTable.doCleanBasedOnPartitions(cleanInstant, Collections.singletonList(partition1));
+
+    while (getNumCompactions(metadataMetaClient) == 0) {
+      // Write until the compaction happens in the metadata table
Review Comment:
During compaction, deleted records are not written to the compacted file,
so no entry remains for that key. Without compaction, the
getRecordsByKeyPrefixes method still reads the deleted record from the log
files: it applies pre-combine and marks the record as deleted, but the
record is still present.
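
To make the difference concrete, here is a minimal, self-contained sketch of the behaviour described above. It is a toy model, not Hudi code: the class `TombstoneCompactionSketch`, its `Entry` type, and the methods `put`, `delete`, `readMerged` and `compact` are all hypothetical names invented for illustration, and "later log entry wins" is only a stand-in for pre-combine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: not Hudi code, every name here is hypothetical. */
public class TombstoneCompactionSketch {

  /** A record in the toy store; deleted == true marks a tombstone. */
  public static final class Entry {
    public final String key;
    public final String value;
    public final boolean deleted;

    Entry(String key, String value, boolean deleted) {
      this.key = key;
      this.value = value;
      this.deleted = deleted;
    }
  }

  // Compacted state: only live records are ever stored here.
  private final Map<String, Entry> baseFile = new LinkedHashMap<>();
  // Appended since the last compaction: live records and tombstones.
  private final List<Entry> logFile = new ArrayList<>();

  public void put(String key, String value) {
    logFile.add(new Entry(key, value, false));
  }

  public void delete(String key) {
    logFile.add(new Entry(key, null, true));
  }

  /**
   * Merged read over base file plus log. The latest entry per key wins
   * (stand-in for pre-combine). Tombstones are NOT filtered out, so a
   * deleted key is still present, only flagged as deleted.
   */
  public Map<String, Entry> readMerged() {
    Map<String, Entry> merged = new LinkedHashMap<>(baseFile);
    for (Entry e : logFile) {
      merged.put(e.key, e);
    }
    return merged;
  }

  /** Rewrites the base file from the merged view, dropping tombstones. */
  public void compact() {
    Map<String, Entry> merged = readMerged();
    baseFile.clear();
    for (Entry e : merged.values()) {
      if (!e.deleted) {
        baseFile.put(e.key, e);
      }
    }
    logFile.clear();
  }

  public static void main(String[] args) {
    TombstoneCompactionSketch store = new TombstoneCompactionSketch();
    store.put("p1", "files-of-p1");
    store.put("p2", "files-of-p2");
    store.delete("p1");

    // Before compaction the key is still there, marked as deleted.
    System.out.println("p1 present before compaction: " + store.readMerged().containsKey("p1"));

    store.compact();

    // After compaction there is no entry for the key at all.
    System.out.println("p1 present after compaction: " + store.readMerged().containsKey("p1"));
  }
}
```

In this model, an assertion that the key for "p1" is absent can only pass once `compact()` has run, which is why the test keeps writing until a compaction happens in the metadata table.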