yihua commented on code in PR #12081:
URL: https://github.com/apache/hudi/pull/12081#discussion_r1801788952


##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiver.java:
##########
@@ -243,6 +243,7 @@ private HoodieWriteConfig 
initTestTableAndGetWriteConfig(boolean enableMetadata,
         .withFileSystemViewConfig(FileSystemViewStorageConfig.newBuilder()
             .withRemoteServerPort(timelineServicePort).build())
         
.withMetadataConfig(HoodieMetadataConfig.newBuilder().enable(enableMetadata)
+            .withMetadataIndexPartitionStats(false)

Review Comment:
   Could we have a config to turn off MDT as a whole instead of turning off 
each MDT index individually?



##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2174,21 +2174,14 @@ public static HoodieMetadataColumnStats 
mergeColumnStatsRecords(HoodieMetadataCo
       return newColumnStats;
     }
 
-    Comparable minValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMinValue()))
-            .filter(Objects::nonNull)
-            .min(Comparator.naturalOrder())
-            .orElse(null);
-
-    Comparable maxValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMaxValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMaxValue()))
-            .filter(Objects::nonNull)
-            .max(Comparator.naturalOrder())
-            .orElse(null);
+    Comparable minValue = castAndCompare(
+        unwrapAvroValueWrapper(newColumnStats.getMinValue()),
+        unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
+        true);

Review Comment:
   We need to think more about type promotion since the column and partition 
stats may need to be regenerated in this case for correctness.  For example, 
take a list of values of `[1,2,3,4,5,6,7,8,9,10]`, when promoting the type from 
integer to String, the min-max range is changed from `[1,10]` to `["1", "9"]`, 
which cannot be done through simple type cast.  I think it's better to avoid 
the type cast here and handle the stats merge somewhere else to avoid possible 
corruption of stats.



##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2174,21 +2174,14 @@ public static HoodieMetadataColumnStats 
mergeColumnStatsRecords(HoodieMetadataCo
       return newColumnStats;
     }
 
-    Comparable minValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMinValue()))
-            .filter(Objects::nonNull)
-            .min(Comparator.naturalOrder())
-            .orElse(null);
-
-    Comparable maxValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMaxValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMaxValue()))
-            .filter(Objects::nonNull)
-            .max(Comparator.naturalOrder())
-            .orElse(null);
+    Comparable minValue = castAndCompare(
+        unwrapAvroValueWrapper(newColumnStats.getMinValue()),
+        unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
+        true);

Review Comment:
   Is this for supporting schema evolution?



##########
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##########
@@ -196,8 +196,8 @@ private List<String> 
getPartitionPathWithPathPrefixUsingFilterExpression(String
                     } else if 
(!path.getName().equals(HoodieTableMetaClient.METAFOLDER_NAME)) {
                       return Pair.of(Option.empty(), Option.of(path));
                     }
-                  } else if (path.getName()
-                      
.startsWith(HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE_PREFIX)) {
+                  } else if 
(path.getName().startsWith(HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE_PREFIX)
+                      || 
MetadataPartitionType.isValidPartitionFilePath(path.getName())) {

Review Comment:
   Let's not add MDT specific logic in this method, which should not be aware 
of whether the table is a data table or metadata table.  Could we fix the the 
issue of writing partition metadata file in MDT directly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to