nsivabalan commented on code in PR #12714:
URL: https://github.com/apache/hudi/pull/12714#discussion_r1930058078


##########
hudi-common/src/main/java/org/apache/hudi/common/util/FileFormatUtils.java:
##########
@@ -59,14 +60,38 @@ public abstract class FileFormatUtils {
    * @param fileColumnRanges List of column range statistics for each file in 
a partition
    */
   public static <T extends Comparable<T>> HoodieColumnRangeMetadata<T> 
getColumnRangeInPartition(String relativePartitionPath,
-                                                                               
                  @Nonnull List<HoodieColumnRangeMetadata<T>> fileColumnRanges) 
{
+                                                                               
                  @Nonnull List<HoodieColumnRangeMetadata<T>> fileColumnRanges,
+                                                                               
                  Map<String, Schema> colsToIndexSchemaMap) {
     ValidationUtils.checkArgument(!fileColumnRanges.isEmpty(), 
"fileColumnRanges should not be empty.");
     // There are multiple files. Compute min(file_mins) and max(file_maxs)
     return fileColumnRanges.stream()
         .map(e -> HoodieColumnRangeMetadata.create(
             relativePartitionPath, e.getColumnName(), e.getMinValue(), 
e.getMaxValue(),
             e.getNullCount(), e.getValueCount(), e.getTotalSize(), 
e.getTotalUncompressedSize()))
-        .reduce(HoodieColumnRangeMetadata::merge).orElseThrow(() -> new 
HoodieException("MergingColumnRanges failed."));
+        .reduce((a,b) -> {
+          try {
+            if (colsToIndexSchemaMap.isEmpty() || (a.getMinValue() == null || 
b.getMinValue() == null)

Review Comment:
   if min value is not null, max value also will not be null. 
   i.e. if there is one non null value, both min and max will be set to that 
value. So, there can't be a scenario where min is not null, but max value is 
null 



##########
hudi-common/src/main/java/org/apache/hudi/common/util/FileFormatUtils.java:
##########
@@ -59,14 +60,38 @@ public abstract class FileFormatUtils {
    * @param fileColumnRanges List of column range statistics for each file in 
a partition
    */
   public static <T extends Comparable<T>> HoodieColumnRangeMetadata<T> 
getColumnRangeInPartition(String relativePartitionPath,
-                                                                               
                  @Nonnull List<HoodieColumnRangeMetadata<T>> fileColumnRanges) 
{
+                                                                               
                  @Nonnull List<HoodieColumnRangeMetadata<T>> fileColumnRanges,
+                                                                               
                  Map<String, Schema> colsToIndexSchemaMap) {
     ValidationUtils.checkArgument(!fileColumnRanges.isEmpty(), 
"fileColumnRanges should not be empty.");
     // There are multiple files. Compute min(file_mins) and max(file_maxs)
     return fileColumnRanges.stream()
         .map(e -> HoodieColumnRangeMetadata.create(
             relativePartitionPath, e.getColumnName(), e.getMinValue(), 
e.getMaxValue(),
             e.getNullCount(), e.getValueCount(), e.getTotalSize(), 
e.getTotalUncompressedSize()))
-        .reduce(HoodieColumnRangeMetadata::merge).orElseThrow(() -> new 
HoodieException("MergingColumnRanges failed."));
+        .reduce((a,b) -> {
+          try {
+            if (colsToIndexSchemaMap.isEmpty() || (a.getMinValue() == null || 
b.getMinValue() == null)
+                || 
(a.getMinValue().getClass().getSimpleName().equals(b.getMinValue().getClass().getSimpleName())
+                && 
a.getMaxValue().getClass().getSimpleName().equals(b.getMaxValue().getClass().getSimpleName())))
 {

Review Comment:
   sure.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to