nsivabalan commented on code in PR #12525:
URL: https://github.com/apache/hudi/pull/12525#discussion_r1904695791


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2206,44 +2306,161 @@ public static boolean validateDataTypeForSecondaryIndex(List<String> sourceField
     });
   }
 
-  public static HoodieData<HoodieRecord> readSecondaryKeysFromBaseFiles(HoodieEngineContext engineContext,
-                                                                        List<Pair<String, Pair<String, List<String>>>> partitionFiles,
-                                                                        int secondaryIndexMaxParallelism,
-                                                                        String activeModule, HoodieTableMetaClient metaClient, EngineType engineType,
-                                                                        HoodieIndexDefinition indexDefinition) {
-    if (partitionFiles.isEmpty()) {
-      return engineContext.emptyHoodieData();
+  /**
+   * Converts the write stats to secondary index records.
+   *
+   * @param allWriteStats   list of write stats
+   * @param instantTime     instant time
+   * @param indexDefinition secondary index definition
+   * @param metadataConfig  metadata config
+   * @param fsView          file system view as of instant time
+   * @param dataMetaClient  data table meta client
+   * @param engineContext   engine context
+   * @param engineType      engine type (e.g. SPARK, FLINK or JAVA)
+   * @return {@link HoodieData} of {@link HoodieRecord} to be updated in the metadata table for the given secondary index partition
+   */
+  public static HoodieData<HoodieRecord> convertWriteStatsToSecondaryIndexRecords(List<HoodieWriteStat> allWriteStats,

Review Comment:
   Can we write UTs for this method? It is critical to get this right. I remember we ran into a few issues that Lin raised after we certified SI, so let's ensure we have good coverage of this method.
   
   Specifically, I am interested in UTs for finding the SI records for a given file slice.
   You could even add a private method that handles a single file slice and add UTs for it.
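
To illustrate the kind of per-file-slice helper that would be easy to unit test, here is a rough sketch. It is not the Hudi API: the class, the `SiRecord` type, and the method name are hypothetical, and plain in-memory maps of record key to secondary key stand in for reading the previous and current file slices. It also assumes that the SI records for a file slice are deletes for mappings that changed or disappeared plus upserts for new or changed mappings, which may not match the PR's logic exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Hudi code: diffs one file slice against its predecessor.
class FileSliceSecondaryIndexSketch {

  /** Hypothetical stand-in for a secondary index record. */
  static final class SiRecord {
    final String secondaryKey;
    final String recordKey;
    final boolean deleted;

    SiRecord(String secondaryKey, String recordKey, boolean deleted) {
      this.secondaryKey = secondaryKey;
      this.recordKey = recordKey;
      this.deleted = deleted;
    }

    @Override
    public String toString() {
      return (deleted ? "DELETE " : "UPSERT ") + secondaryKey + "->" + recordKey;
    }
  }

  /**
   * Computes the SI records for a single file slice.
   * Both maps go from record key to secondary key; values are assumed non-null.
   */
  static List<SiRecord> secondaryIndexRecordsForFileSlice(Map<String, String> previous,
                                                          Map<String, String> current) {
    List<SiRecord> out = new ArrayList<>();
    // Deletes first: mappings whose secondary key changed or whose record is gone.
    // TreeMap gives a deterministic (record-key sorted) order for assertions.
    for (Map.Entry<String, String> e : new TreeMap<>(previous).entrySet()) {
      String oldSecondaryKey = e.getValue();
      if (!oldSecondaryKey.equals(current.get(e.getKey()))) {
        out.add(new SiRecord(oldSecondaryKey, e.getKey(), true));
      }
    }
    // Then upserts: records that are new or whose secondary key changed.
    for (Map.Entry<String, String> e : new TreeMap<>(current).entrySet()) {
      String newSecondaryKey = e.getValue();
      if (!newSecondaryKey.equals(previous.get(e.getKey()))) {
        out.add(new SiRecord(newSecondaryKey, e.getKey(), false));
      }
    }
    return out;
  }
}
```

UTs for a helper shaped like this could then cover the unchanged, updated, deleted, and newly inserted record cases independently of the engine context and the file system view.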
   
   


