codope commented on code in PR #12105:
URL: https://github.com/apache/hudi/pull/12105#discussion_r1803162951
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -753,7 +752,7 @@ public static HoodieData<HoodieRecord> convertMetadataToColumnStatsRecords(Hoodi
 String partitionPath = deleteFileInfoPair.getLeft();
 String filePath = deleteFileInfoPair.getRight();
- if (filePath.endsWith(HoodieFileFormat.PARQUET.getFileExtension()) || ExternalFilePathUtil.isExternallyCreatedFile(filePath)) {
+ if (ExternalFilePathUtil.isExternallyCreatedFile(filePath)) {
Review Comment:
This change causes `TestExternalPathHandling.testFlow` to fail, because the
stats are excluded for the replaced parquet file. What is the intention
behind this change?

I guess the intention is that any cleaned file (data files as well as log
files) should have its entries removed from col stats, right? If so, should
we just change the condition to something like:
`if (filePath.endsWith(HoodieFileFormat.PARQUET.getFileExtension()) || FSUtils.isLogFile(filePath) || ExternalFilePathUtil.isExternallyCreatedFile(filePath))`
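
To make the difference between the two conditions concrete, here is a minimal standalone sketch. It does not use the Hudi classes: `isLogFile`, `isExternallyCreatedFile`, the `.parquet` extension constant, the `_hudiext` suffix, and the log-file regex are all simplified, hypothetical stand-ins for `FSUtils.isLogFile`, `ExternalFilePathUtil.isExternallyCreatedFile`, and `HoodieFileFormat.PARQUET.getFileExtension()`, and the file names are made up for illustration.

```java
import java.util.regex.Pattern;

public class ColStatsDeleteFilter {
    // Assumption: stand-in for HoodieFileFormat.PARQUET.getFileExtension().
    static final String PARQUET_EXTENSION = ".parquet";
    // Assumption: illustrative marker suffix for externally created files.
    static final String EXTERNAL_FILE_SUFFIX = "_hudiext";
    // Assumption: simplified log-file name shape (".<id>.log.<version>...");
    // the real FSUtils.isLogFile check may differ.
    static final Pattern LOG_FILE_PATTERN = Pattern.compile("^\\..+\\.log\\.\\d+.*$");

    // Hypothetical stand-in for FSUtils.isLogFile(filePath).
    static boolean isLogFile(String filePath) {
        String name = filePath.substring(filePath.lastIndexOf('/') + 1);
        return LOG_FILE_PATTERN.matcher(name).matches();
    }

    // Hypothetical stand-in for ExternalFilePathUtil.isExternallyCreatedFile(filePath).
    static boolean isExternallyCreatedFile(String filePath) {
        return filePath.endsWith(EXTERNAL_FILE_SUFFIX);
    }

    // Condition as changed in the PR: only externally created files match.
    static boolean conditionInPr(String filePath) {
        return isExternallyCreatedFile(filePath);
    }

    // Condition suggested in the review: parquet, log, or externally created files.
    static boolean suggestedCondition(String filePath) {
        return filePath.endsWith(PARQUET_EXTENSION)
            || isLogFile(filePath)
            || isExternallyCreatedFile(filePath);
    }

    public static void main(String[] args) {
        String[] deletedFiles = {
            "p1/abc_1-0-1_001.parquet",      // base file
            "p1/.abc_001.log.1_1-0-1",       // log file
            "p1/file1.orc_0_hudiext",        // externally created file
            "p1/.hoodie_partition_metadata"  // neither
        };
        for (String f : deletedFiles) {
            System.out.println(f + " pr=" + conditionInPr(f)
                + " suggested=" + suggestedCondition(f));
        }
    }
}
```

Under these assumptions, a replaced parquet file matches only the suggested condition, which is the case `TestExternalPathHandling.testFlow` appears to exercise.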