codope commented on code in PR #12525:
URL: https://github.com/apache/hudi/pull/12525#discussion_r1905159369
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2206,44 +2306,161 @@ public static boolean validateDataTypeForSecondaryIndex(List<String> sourceField
});
}
- public static HoodieData<HoodieRecord> readSecondaryKeysFromBaseFiles(HoodieEngineContext engineContext,
-     List<Pair<String, Pair<String, List<String>>>> partitionFiles,
-     int secondaryIndexMaxParallelism,
-     String activeModule, HoodieTableMetaClient metaClient, EngineType engineType,
-     HoodieIndexDefinition indexDefinition) {
- if (partitionFiles.isEmpty()) {
- return engineContext.emptyHoodieData();
+ /**
+ * Converts the write stats to secondary index records.
+ *
+ * @param allWriteStats list of write stats
+ * @param instantTime instant time
+ * @param indexDefinition secondary index definition
+ * @param metadataConfig metadata config
+ * @param fsView file system view as of instant time
+ * @param dataMetaClient data table meta client
+ * @param engineContext engine context
+ * @param engineType engine type (e.g. SPARK, FLINK or JAVA)
+ * @return {@link HoodieData} of {@link HoodieRecord} to be updated in the metadata table for the given secondary index partition
+ */
+ public static HoodieData<HoodieRecord> convertWriteStatsToSecondaryIndexRecords(List<HoodieWriteStat> allWriteStats,
Review Comment:
Most of the tests in `TestSecondaryIndex` and `TestSecondaryIndexPruning`
already cover many cases with single or multiple log files in the same
filegroup, with updates and deletes for single and multiple keys. That said,
I have added a UT for this method in `TestMetadataUtilRLIandSIRecordGeneration`,
which also tests the SI initialization util method.