rdblue commented on a change in pull request #91: Ignore unsupported partition 
fields
URL: https://github.com/apache/incubator-iceberg/pull/91#discussion_r254440184
 
 

 ##########
 File path: api/src/main/java/com/netflix/iceberg/transforms/Transforms.java
 ##########
 @@ -41,27 +42,27 @@ private Transforms() {
 
   private static final Pattern HAS_WIDTH = 
Pattern.compile("(\\w+)\\[(\\d+)\\]");
 
-  public static Transform<?, ?> fromString(Type type, String transform) {
+  public static Optional<Transform<?, ?>> fromString(Type type, String 
transform) {
     Matcher width = HAS_WIDTH.matcher(transform);
     if (width.matches()) {
       String name = width.group(1);
       int w = Integer.parseInt(width.group(2));
       if (name.equalsIgnoreCase("truncate")) {
-        return Truncate.get(type, w);
+        return Optional.of(Truncate.get(type, w));
       } else if (name.equals("bucket")) {
-        return Bucket.get(type, w);
+        return Optional.of(Bucket.get(type, w));
       }
     }
 
     if (transform.equalsIgnoreCase("identity")) {
-      return Identity.get(type);
+      return Optional.of(Identity.get(type));
     } else if (type.typeId() == Type.TypeID.TIMESTAMP) {
-      return Timestamps.valueOf(transform.toUpperCase(Locale.ENGLISH));
+      return 
Optional.of(Timestamps.valueOf(transform.toUpperCase(Locale.ENGLISH)));
     } else if (type.typeId() == Type.TypeID.DATE) {
-      return Dates.valueOf(transform.toUpperCase(Locale.ENGLISH));
+      return Optional.of(Dates.valueOf(transform.toUpperCase(Locale.ENGLISH)));
     }
 
-    throw new IllegalArgumentException("Unknown transform: " + transform);
+    return Optional.empty();
 
 Review comment:
   We don't use optional very much (if at all) in the Iceberg API. It is 
usually better to return something that behaves correctly and doesn't require 
the caller to handle a new case, like returning `ImmutableList.of()` for a list 
instead of `Optional.empty()`. When the caller should handle a case, we usually 
return `null` and document it to avoid instantiating extra objects. Both of 
these options don't require making type changes to the API.
   
   In this case, I think the first option is a better implementation. This 
should return an `UnknownTransform` implementation. If a transform is not 
recognized, Iceberg can still read data, but cannot write data for that 
partition spec because it doesn't know how to derive a partition value using 
the transform. Returning `Optional.empty` here and ignoring unknown transforms 
in `PartitionSpec` would not signal that the partition spec can't be used to 
write. An older writer may try to add files and compact them with a manifest 
from a newer writer, which would cause problems.
   
   An `UnknownTransform` implementation should throw an exception in `apply`. 
It should also return `Expressions.alwaysTrue` for `project` (all partition 
values may contain matching data) and `null` for `projectStrict` (no guarantees 
about matching a predicate).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to