yihua commented on code in PR #13711:
URL: https://github.com/apache/hudi/pull/13711#discussion_r2353665825
##########
hudi-common/src/main/java/org/apache/hudi/common/util/FileFormatUtils.java:
##########
@@ -64,10 +66,24 @@ public abstract class FileFormatUtils {
* @param fileColumnRanges List of column range statistics for each file in
a partition
*/
public static <T extends Comparable<T>> HoodieColumnRangeMetadata<T>
getColumnRangeInPartition(String relativePartitionPath,
+
String columnName,
@Nonnull List<HoodieColumnRangeMetadata<T>> fileColumnRanges,
-
Map<String, Schema> colsToIndexSchemaMap) {
+
Map<String, Schema> colsToIndexSchemaMap,
+
HoodieIndexVersion indexVersion) {
ValidationUtils.checkArgument(!fileColumnRanges.isEmpty(),
"fileColumnRanges should not be empty.");
+ if (indexVersion.greaterThanOrEquals(HoodieIndexVersion.V2)) {
+ ValueMetadata valueMetadata =
ValueMetadata.getValueMetadata(colsToIndexSchemaMap.get(columnName),
indexVersion);
+ return fileColumnRanges.stream()
+ .map(e -> {
+ T minValue = (T)
valueMetadata.standardizeJavaTypeAndPromote(e.getMinValue());
+ T maxValue = (T)
valueMetadata.standardizeJavaTypeAndPromote(e.getMaxValue());
+ return HoodieColumnRangeMetadata.create(
+ relativePartitionPath, e.getColumnName(), minValue, maxValue,
e.getNullCount(), e.getValueCount(), e.getTotalSize(),
+ e.getTotalUncompressedSize(), valueMetadata);
+ }).reduce(HoodieColumnRangeMetadata::merge).orElseThrow(() -> new
HoodieException("MergingColumnRanges failed."));
+ }
+ // we are reducing using merge so IDK why we think there are multiple cols
that need to go through schema evolution
Review Comment:
AFAIK the merging of column stats should only happen for partition stats,
i.e., merging the stats of multiple files in a partition. Some type promotions
in schema evolution should not happen on min and max values, e.g., int ->
string, where the ordering can be different, e.g., 2 compared to 11, vs "2"
compared to "11".
##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroWrapperUtils.java:
##########
@@ -238,4 +234,120 @@ private static Pair<Boolean, String>
getIsValueWrapperObfuscated(Object statsVal
}
return Pair.of(false, null);
}
+
+ public enum PrimitiveWrapperType {
+ V1(Object.class, HoodieAvroWrapperUtils::wrapValueIntoAvro,
HoodieAvroWrapperUtils::unwrapAvroValueWrapper, GenericRecord.class),
+ NULL(Void.class, HoodieAvroWrapperUtils::wrapNull,
HoodieAvroWrapperUtils::unwrapNull, Void.class),
Review Comment:
Is this used? My understanding is that the min and max values do not
consider nulls.
##########
hudi-common/src/main/java/org/apache/hudi/common/util/DateTimeUtils.java:
##########
@@ -71,6 +77,41 @@ public static long instantToMicros(Instant instant) {
}
}
+ public static long instantToNanos(Instant instant) {
+ long seconds = instant.getEpochSecond();
+ int nanos = instant.getNano();
+
+ if (seconds < 0 && nanos > 0) {
+ // Shift seconds by +1, then subtract a full second in nanos
+ long totalNanos = Math.multiplyExact(seconds + 1, 1_000_000_000L);
+ long adjustment = nanos - 1_000_000_000L;
+ return Math.addExact(totalNanos, adjustment);
+ } else {
+ long totalNanos = Math.multiplyExact(seconds, 1_000_000_000L);
+ return Math.addExact(totalNanos, nanos);
+ }
+ }
Review Comment:
Could we add a few unit tests on these new date and time utils?
##########
hudi-common/src/main/java/org/apache/hudi/common/util/DateTimeUtils.java:
##########
@@ -71,6 +77,41 @@ public static long instantToMicros(Instant instant) {
}
}
+ public static long instantToNanos(Instant instant) {
+ long seconds = instant.getEpochSecond();
+ int nanos = instant.getNano();
+
+ if (seconds < 0 && nanos > 0) {
+ // Shift seconds by +1, then subtract a full second in nanos
+ long totalNanos = Math.multiplyExact(seconds + 1, 1_000_000_000L);
+ long adjustment = nanos - 1_000_000_000L;
+ return Math.addExact(totalNanos, adjustment);
+ } else {
+ long totalNanos = Math.multiplyExact(seconds, 1_000_000_000L);
+ return Math.addExact(totalNanos, nanos);
+ }
+ }
Review Comment:
Are these standard utils copied from somewhere?
##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroWrapperUtils.java:
##########
@@ -238,4 +234,120 @@ private static Pair<Boolean, String>
getIsValueWrapperObfuscated(Object statsVal
}
return Pair.of(false, null);
}
+
+ public enum PrimitiveWrapperType {
+ V1(Object.class, HoodieAvroWrapperUtils::wrapValueIntoAvro,
HoodieAvroWrapperUtils::unwrapAvroValueWrapper, GenericRecord.class),
+ NULL(Void.class, HoodieAvroWrapperUtils::wrapNull,
HoodieAvroWrapperUtils::unwrapNull, Void.class),
+ BOOLEAN(Boolean.class, HoodieAvroWrapperUtils::wrapBoolean,
HoodieAvroWrapperUtils::unwrapBoolean, BooleanWrapper.class),
+ INT(Integer.class, HoodieAvroWrapperUtils::wrapInt,
HoodieAvroWrapperUtils::unwrapInt, IntWrapper.class),
+ LONG(Long.class, HoodieAvroWrapperUtils::wrapLong,
HoodieAvroWrapperUtils::unwrapLong, LongWrapper.class),
+ FLOAT(Float.class, HoodieAvroWrapperUtils::wrapFloat,
HoodieAvroWrapperUtils::unwrapFloat, FloatWrapper.class),
+ DOUBLE(Double.class, HoodieAvroWrapperUtils::wrapDouble,
HoodieAvroWrapperUtils::unwrapDouble, DoubleWrapper.class),
+ STRING(String.class, HoodieAvroWrapperUtils::wrapString,
HoodieAvroWrapperUtils::unwrapString, StringWrapper.class),
+ BYTES(ByteBuffer.class, HoodieAvroWrapperUtils::wrapBytes,
HoodieAvroWrapperUtils::unwrapBytes, BytesWrapper.class);
+
+ private final Class<?> clazz;
+ private final Function<Comparable<?>, Object> wrapper;
+ private final Function<Object, Comparable<?>> unwrapper;
+ private final Class<?> wrapperClass;
+
+ PrimitiveWrapperType(Class<?> clazz, Function<Comparable<?>, Object>
wrapper, Function<Object, Comparable<?>> unwrapper, Class<?> wrapperClass) {
+ this.clazz = clazz;
+ this.wrapper = wrapper;
+ this.unwrapper = unwrapper;
+ this.wrapperClass = wrapperClass;
+ }
+
+ public Class<?> getClazz() {
+ return clazz;
+ }
+
+ public Object wrap(Comparable<?> value) {
+ return wrapper.apply(value);
+ }
+
+ public Comparable<?> unwrap(Object value) {
+ return unwrapper.apply(value);
+ }
+
+ public Class<?> getWrapperClass() {
+ return wrapperClass;
+ }
+ }
+
+ private static Object wrapNull(Comparable<?> value) {
+ return value;
Review Comment:
So is `value` `null` here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]