codope commented on code in PR #12081:
URL: https://github.com/apache/hudi/pull/12081#discussion_r1802472034


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -2174,21 +2174,14 @@ public static HoodieMetadataColumnStats 
mergeColumnStatsRecords(HoodieMetadataCo
       return newColumnStats;
     }
 
-    Comparable minValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMinValue()))
-            .filter(Objects::nonNull)
-            .min(Comparator.naturalOrder())
-            .orElse(null);
-
-    Comparable maxValue =
-        (Comparable) Stream.of(
-                (Comparable) 
unwrapAvroValueWrapper(prevColumnStats.getMaxValue()),
-                (Comparable) 
unwrapAvroValueWrapper(newColumnStats.getMaxValue()))
-            .filter(Objects::nonNull)
-            .max(Comparator.naturalOrder())
-            .orElse(null);
+    Comparable minValue = castAndCompare(
+        unwrapAvroValueWrapper(newColumnStats.getMinValue()),
+        unwrapAvroValueWrapper(prevColumnStats.getMinValue()),
+        true);

Review Comment:
   +1 we have to be careful about this. Numerical promotions that retain the 
numerical nature of the data (like int to long, float, or double) and some 
well-structured type conversions (e.g., timestamp to string in ISO format) are 
typically order-preserving. However, transitions involving strings, especially 
when converting from numeric types, are often not order-preserving.
   
   This should be an issue even with column stats. To move forward, I would 
suggest to catch the cast exception and throw a meaningful error and ask user 
to disable column stats/partition stats to unblock ingestion. Wdyt @yihua 
@linliu-code @nsivabalan ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to