yihua commented on code in PR #12105:
URL: https://github.com/apache/hudi/pull/12105#discussion_r1818279313


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -970,18 +967,14 @@ public static HoodieData<HoodieRecord> convertFilesToColumnStatsRecords(HoodieEn
     final List<Tuple3<String, String, Boolean>> partitionFileFlagTupleList = fetchPartitionFileInfoTriplets(partitionToDeletedFiles, partitionToAppendedFiles);
 
     // Create records MDT
+    Option<Schema> writerSchemaOpt = fetchStatsForLogFiles ? tryResolveSchemaForTable(dataMetaClient) : Option.empty();

Review Comment:
   The reason I asked about this is that log blocks and files can be
written by different commits using different writer schemas in case there is
schema evolution.  Strictly speaking, the column stats of a log file should
only contain the columns that exist in the file.
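
   To make the concern concrete, here is a minimal, dependency-free sketch of
the idea, not the actual Hudi implementation. The class
`LogFileColumnFilter` and the method `columnsToIndexForFile` are hypothetical
names, and schemas are modeled as plain column-name collections rather than
Avro `Schema` objects. It assumes the writer schema of each log file is
available and restricts the stats columns to those the file actually contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LogFileColumnFilter {

  // Hypothetical helper, not part of Hudi: keeps only the columns that the
  // log file's own writer schema contains, preserving the table column order.
  static List<String> columnsToIndexForFile(List<String> tableColumns,
                                            Set<String> fileWriterColumns) {
    List<String> result = new ArrayList<>();
    for (String column : tableColumns) {
      if (fileWriterColumns.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Latest table schema, after a later commit added the column "tip".
    List<String> tableColumns = Arrays.asList("id", "fare", "tip");
    // Writer schema of an older log file, written before "tip" existed.
    Set<String> oldLogFileColumns =
        new LinkedHashSet<>(Arrays.asList("id", "fare"));

    // Prints [id, fare]: no stats entry is produced for "tip" on this file.
    System.out.println(columnsToIndexForFile(tableColumns, oldLogFileColumns));
  }
}
```

   With a single table-level schema, the older file would instead get a stats
entry for `tip` even though the column does not exist in that file.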


