danny0405 commented on code in PR #13383:
URL: https://github.com/apache/hudi/pull/13383#discussion_r2128025159


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -1701,31 +1677,38 @@ private static List<HoodieColumnRangeMetadata<Comparable>> readColumnRangeMetada
    * Read column range metadata from log file.
    */
   @VisibleForTesting
-  public static List<HoodieColumnRangeMetadata<Comparable>> getLogFileColumnRangeMetadata(String filePath, HoodieTableMetaClient datasetMetaClient,
+  public static List<HoodieColumnRangeMetadata<Comparable>> getLogFileColumnRangeMetadata(String filePath, String partitionPath,
+                                                                                          HoodieTableMetaClient datasetMetaClient,
                                                                                           List<String> columnsToIndex, Option<Schema> writerSchemaOpt,
                                                                                           int maxBufferSize) throws IOException {
     if (writerSchemaOpt.isPresent()) {
       List<Pair<String, Schema.Field>> fieldsToIndex = columnsToIndex.stream().map(fieldName -> HoodieAvroUtils.getSchemaForField(writerSchemaOpt.get(), fieldName))
           .collect(Collectors.toList());
       // read log file records without merging

Review Comment:
   > if the records are unique to avoid duplicates on the data. So maybe this is why?
   
   For indexes such as the bucket index, log records are not guaranteed to be
   de-duplicated. I agree with you that we should keep the unmerged read
   behavior, which is the same as on master.
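
   A minimal sketch of why reading without merging is still safe for column
   ranges, under my reading of the point above: the min/max computed over every
   log record, duplicates included, always covers the min/max of any merged
   view of the same records. This is not Hudi code; `UnmergedColumnRange`,
   `Rec`, `range`, and `mergeLatestByKey` are hypothetical names used only for
   illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these types exist in Hudi.
final class UnmergedColumnRange {

    // Hypothetical log record: a record key plus one indexed int column.
    public static final class Rec {
        public final String key;
        public final int value;

        public Rec(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Returns {min, max} of the indexed column over the given records.
    public static int[] range(Collection<Rec> records) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (Rec r : records) {
            min = Math.min(min, r.value);
            max = Math.max(max, r.value);
        }
        return new int[] {min, max};
    }

    // Simplified merge: the last record written for each key wins.
    public static Collection<Rec> mergeLatestByKey(List<Rec> records) {
        Map<String, Rec> latest = new LinkedHashMap<>();
        for (Rec r : records) {
            latest.put(r.key, r);
        }
        return latest.values();
    }

    public static void main(String[] args) {
        // Log with duplicate keys, as can happen when records are not de-duplicated.
        List<Rec> log = Arrays.asList(
            new Rec("k1", 5), new Rec("k1", 50),
            new Rec("k2", 20), new Rec("k2", 10));

        int[] unmerged = range(log);
        int[] merged = range(mergeLatestByKey(log));

        System.out.println("unmerged: " + Arrays.toString(unmerged));
        System.out.println("merged:   " + Arrays.toString(merged));
    }
}
```

   The unmerged range may be wider than the merged one, so it can prune fewer
   files, but it never excludes a value that a reader could see after merging.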


