yihua commented on code in PR #12105:
URL: https://github.com/apache/hudi/pull/12105#discussion_r1818279313
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -970,18 +967,14 @@ public static HoodieData<HoodieRecord> convertFilesToColumnStatsRecords(HoodieEn
final List<Tuple3<String, String, Boolean>> partitionFileFlagTupleList = fetchPartitionFileInfoTriplets(partitionToDeletedFiles, partitionToAppendedFiles);
// Create records MDT
+ Option<Schema> writerSchemaOpt = fetchStatsForLogFiles ? tryResolveSchemaForTable(dataMetaClient) : Option.empty();
Review Comment:
The reason I asked about this is that log blocks and log files can be
written by different commits using different writer schemas when there is
schema evolution. Strictly speaking, the column stats of a log file should
only contain the columns that exist in that file.
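
To make the concern concrete, here is a minimal sketch of restricting the indexed columns to those present in a given log file's writer schema. It is an illustration only, not Hudi code: the class `LogFileColumnStats` and the method `columnsToIndex` are hypothetical names, and plain column-name lists stand in for Avro schemas.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hudi: picks the columns for which stats
// should be recorded for one log file.
final class LogFileColumnStats {

    // Returns the target columns that also appear in the writer schema of the
    // given file, preserving the order of targetColumns.
    static List<String> columnsToIndex(List<String> targetColumns,
                                       List<String> fileWriterSchemaColumns) {
        Set<String> presentInFile = new HashSet<>(fileWriterSchemaColumns);
        List<String> result = new ArrayList<>();
        for (String column : targetColumns) {
            if (presentInFile.contains(column)) {
                result.add(column);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed scenario: the latest table schema gained a "price" column,
        // but this log file was written by an older commit without it.
        List<String> latestTableColumns = List.of("id", "name", "price");
        List<String> olderLogFileColumns = List.of("id", "name");

        // prints [id, name]
        System.out.println(columnsToIndex(latestTableColumns, olderLogFileColumns));
    }
}
```

With a single table-level schema, the older log file would be treated as if it had a `price` column; resolving the columns per file avoids that.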