codope commented on code in PR #12050:
URL: https://github.com/apache/hudi/pull/12050#discussion_r1798720913


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -1215,20 +1204,25 @@ private static Stream<HoodieRecord> getColumnStatsRecords(String partitionPath,
     }
 
     List<HoodieColumnRangeMetadata<Comparable>> columnRangeMetadata =
-        readColumnRangeMetadataFrom(filePartitionPath, datasetMetaClient, columnsToIndex);
+        readColumnRangeMetadataFrom(filePartitionPath, datasetMetaClient, columnsToIndex, false, Option.empty());
 
     return HoodieMetadataPayload.createColumnStatsRecords(partitionPath, columnRangeMetadata, false);
   }
 
   private static List<HoodieColumnRangeMetadata<Comparable>> readColumnRangeMetadataFrom(String filePath,
                                                                                          HoodieTableMetaClient datasetMetaClient,
-                                                                                         List<String> columnsToIndex) {
+                                                                                         List<String> columnsToIndex,
+                                                                                         boolean shouldReadColumnStatsForLogFiles,
+                                                                                         Option<Schema> writerSchemaOpt) {
     try {
+      StoragePath fullFilePath = new StoragePath(datasetMetaClient.getBasePath(), filePath);
       if (filePath.endsWith(HoodieFileFormat.PARQUET.getFileExtension())) {
-        StoragePath fullFilePath = new StoragePath(datasetMetaClient.getBasePath(), filePath);
         return HoodieIOFactory.getIOFactory(datasetMetaClient.getStorage())
             .getFileFormatUtils(HoodieFileFormat.PARQUET)
             .readColumnStatsFromMetadata(datasetMetaClient.getStorage(), fullFilePath, columnsToIndex);
+      } else if (FSUtils.isLogFile(fullFilePath) && shouldReadColumnStatsForLogFiles) {
+        LOG.warn("Reading log file: {}, to build column range metadata.", fullFilePath);
+        return getLogFileColumnRangeMetadata(fullFilePath.toString(), datasetMetaClient, columnsToIndex, writerSchemaOpt);

Review Comment:
   Yeah, we are reusing it in case of updates. The boolean
`shouldReadColumnStatsForLogFiles` is true only when initializing the partition
stats. On subsequent updates, it is set to false.
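
   To illustrate the gating described in this comment, here is a minimal,
self-contained sketch. It does not use the Hudi classes: `chooseSource`,
`looksLikeLogFile`, the `Source` enum, and the sample file names are all
hypothetical stand-ins, and the `SKIPPED` fallback is an assumption, since the
quoted hunk ends before the remaining branches.

```java
public class ColumnStatsSourceSketch {

  // Hypothetical labels for where column range metadata would come from.
  enum Source { PARQUET_FOOTER, LOG_FILE_SCAN, SKIPPED }

  // Hypothetical stand-in for FSUtils.isLogFile(...); assumes log file names contain ".log.".
  static boolean looksLikeLogFile(String filePath) {
    String name = filePath.substring(filePath.lastIndexOf('/') + 1);
    return name.contains(".log.");
  }

  // Mirrors the branch order in the quoted diff: parquet first, then log files
  // only when the caller opts in through the boolean flag.
  static Source chooseSource(String filePath, boolean shouldReadColumnStatsForLogFiles) {
    if (filePath.endsWith(".parquet")) {
      return Source.PARQUET_FOOTER;
    } else if (looksLikeLogFile(filePath) && shouldReadColumnStatsForLogFiles) {
      return Source.LOG_FILE_SCAN;
    }
    // Assumption: anything else yields no stats in this sketch.
    return Source.SKIPPED;
  }

  public static void main(String[] args) {
    String logFile = "2024/01/.f1_001.log.1_0-1-0";       // made-up name
    String baseFile = "2024/01/f1_0-1-0_001.parquet";     // made-up name

    // Initialization of partition stats: flag is true, so log files are read.
    System.out.println(chooseSource(logFile, true));
    // Subsequent updates: flag is false, so log files are not read.
    System.out.println(chooseSource(logFile, false));
    // Parquet base files take the footer path regardless of the flag.
    System.out.println(chooseSource(baseFile, false));
  }
}
```

   The flag only affects the log-file branch; the parquet path behaves the same
in both the initialization and the update case.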



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

