rdblue commented on code in PR #12311:
URL: https://github.com/apache/iceberg/pull/12311#discussion_r1970487043


##########
api/src/main/java/org/apache/iceberg/variants/VariantUtil.java:
##########
@@ -18,21 +18,112 @@
  */
 package org.apache.iceberg.variants;
 
+import java.math.BigDecimal;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
 import java.nio.charset.StandardCharsets;
+import java.util.Map;
 import java.util.function.Function;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
 
-class VariantUtil {
+public class VariantUtil {
   private static final int BASIC_TYPE_MASK = 0b11;
   private static final int BASIC_TYPE_PRIMITIVE = 0;
   private static final int BASIC_TYPE_SHORT_STRING = 1;
   private static final int BASIC_TYPE_OBJECT = 2;
   private static final int BASIC_TYPE_ARRAY = 3;
 
+  // TODO: Implement PhysicalType.TIME
+  // TODO: Implement PhysicalType.TIMESTAMPNTZ_NANO and 
PhysicalType.TIMESTAMPTZ_NANO
+  // TODO: Implement PhysicalType.UUID
+  private static final Map<Type, PhysicalType> NO_CONVERSION_NEEDED =
+      ImmutableMap.<Type, PhysicalType>builder()
+          .put(Types.IntegerType.get(), PhysicalType.INT32)
+          .put(Types.LongType.get(), PhysicalType.INT64)
+          .put(Types.FloatType.get(), PhysicalType.FLOAT)
+          .put(Types.DoubleType.get(), PhysicalType.DOUBLE)
+          .put(Types.DateType.get(), PhysicalType.DATE)
+          .put(Types.TimestampType.withoutZone(), PhysicalType.TIMESTAMPNTZ)
+          .put(Types.TimestampType.withZone(), PhysicalType.TIMESTAMPTZ)
+          .put(Types.StringType.get(), PhysicalType.STRING)
+          .put(Types.BinaryType.get(), PhysicalType.BINARY)
+          .put(Types.UnknownType.get(), PhysicalType.NULL)
+          .build();
+
   private VariantUtil() {}
 
+  @SuppressWarnings("unchecked")
+  public static <T> T castTo(VariantValue value, Type type) {

Review Comment:
   I don't think that we should expose this on `VariantValue` and in fact I 
almost moved it into `InclusiveMetricsEvaluator` because the behavior here is a 
bit misleading. The problem is that we don't want to make callers think that 
this conversion is equivalent to any SQL operation because SQL engines will 
have a lot more complex logic.
   
   For instance, a SQL engine might have an expression like 
`CAST(extract(var_col, "$.field") AS INT) < 10`. Iceberg does something 
similar, but the cast expression in SQL is much more general and may handle 
cases like variant string to int. We want to avoid a false equivalence with SQL 
behavior, which is up to the engine. This is why the `Extract.eval` method is 
not implemented.
   
   I think the right thing is actually to make this less public and move it 
into a util class in expressions so that those classes can use it, but we don't 
expose it more broadly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to