codope commented on code in PR #12105:
URL: https://github.com/apache/hudi/pull/12105#discussion_r1803162951


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -753,7 +752,7 @@ public static HoodieData<HoodieRecord> convertMetadataToColumnStatsRecords(Hoodi
           String partitionPath = deleteFileInfoPair.getLeft();
           String filePath = deleteFileInfoPair.getRight();
 
-          if (filePath.endsWith(HoodieFileFormat.PARQUET.getFileExtension()) || ExternalFilePathUtil.isExternallyCreatedFile(filePath)) {
+          if (ExternalFilePathUtil.isExternallyCreatedFile(filePath)) {

Review Comment:
   This change causes `TestExternalPathHandling.testFlow` to fail because the 
stats are now excluded for the replaced parquet file. What is the intention 
behind this change?
   
   I guess the intention is that any cleaned file (data files as well as log 
files) should have its entries removed from col stats, right? If so, should we 
just change the condition to something like: 
`if (filePath.endsWith(HoodieFileFormat.PARQUET.getFileExtension()) || FSUtils.isLogFile(filePath) || ExternalFilePathUtil.isExternallyCreatedFile(filePath))`
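
To make the suggested condition concrete, here is a minimal standalone sketch of the three-way check. It does not call Hudi: `ColStatsDeleteFilter`, `shouldDeleteColStats`, the `.parquet` constant, the log-file regex, and the `_hudiext` marker are all hypothetical, simplified stand-ins for `HoodieFileFormat.PARQUET.getFileExtension()`, `FSUtils.isLogFile`, and `ExternalFilePathUtil.isExternallyCreatedFile`, whose real implementations may differ.

```java
import java.util.regex.Pattern;

public class ColStatsDeleteFilter {
  // Assumption: stand-in for HoodieFileFormat.PARQUET.getFileExtension().
  static final String PARQUET_EXTENSION = ".parquet";
  // Assumption: simplified stand-in for FSUtils.isLogFile; the real check is stricter.
  static final Pattern LOG_FILE = Pattern.compile(".*\\.log\\.\\d+.*");
  // Assumption: simplified stand-in for the external-file marker.
  static final String EXTERNAL_MARKER = "_hudiext";

  static boolean isLogFile(String filePath) {
    // Match against the file name only, not the partition path.
    String name = filePath.substring(filePath.lastIndexOf('/') + 1);
    return LOG_FILE.matcher(name).matches();
  }

  static boolean isExternallyCreatedFile(String filePath) {
    return filePath.endsWith(EXTERNAL_MARKER);
  }

  // Mirrors the condition proposed in the review comment: base files,
  // log files, and externally created files all get col stats cleanup.
  static boolean shouldDeleteColStats(String filePath) {
    return filePath.endsWith(PARQUET_EXTENSION)
        || isLogFile(filePath)
        || isExternallyCreatedFile(filePath);
  }

  public static void main(String[] args) {
    System.out.println(shouldDeleteColStats("p1/abc_1-0-1_001.parquet"));
    System.out.println(shouldDeleteColStats("p1/.abc_001.log.1_1-0-1"));
    System.out.println(shouldDeleteColStats("p1/.hoodie_partition_metadata"));
  }
}
```

With a check of this shape, a replaced parquet file would still have its col stats entries removed, which is the case `TestExternalPathHandling.testFlow` appears to depend on.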



