rdblue commented on a change in pull request #91: Ignore unsupported partition
fields
URL: https://github.com/apache/incubator-iceberg/pull/91#discussion_r254440184
##########
File path: api/src/main/java/com/netflix/iceberg/transforms/Transforms.java
##########
@@ -41,27 +42,27 @@ private Transforms() {
private static final Pattern HAS_WIDTH =
Pattern.compile("(\\w+)\\[(\\d+)\\]");
- public static Transform<?, ?> fromString(Type type, String transform) {
+ public static Optional<Transform<?, ?>> fromString(Type type, String
transform) {
Matcher width = HAS_WIDTH.matcher(transform);
if (width.matches()) {
String name = width.group(1);
int w = Integer.parseInt(width.group(2));
if (name.equalsIgnoreCase("truncate")) {
- return Truncate.get(type, w);
+ return Optional.of(Truncate.get(type, w));
} else if (name.equals("bucket")) {
- return Bucket.get(type, w);
+ return Optional.of(Bucket.get(type, w));
}
}
if (transform.equalsIgnoreCase("identity")) {
- return Identity.get(type);
+ return Optional.of(Identity.get(type));
} else if (type.typeId() == Type.TypeID.TIMESTAMP) {
- return Timestamps.valueOf(transform.toUpperCase(Locale.ENGLISH));
+ return
Optional.of(Timestamps.valueOf(transform.toUpperCase(Locale.ENGLISH)));
} else if (type.typeId() == Type.TypeID.DATE) {
- return Dates.valueOf(transform.toUpperCase(Locale.ENGLISH));
+ return Optional.of(Dates.valueOf(transform.toUpperCase(Locale.ENGLISH)));
}
- throw new IllegalArgumentException("Unknown transform: " + transform);
+ return Optional.empty();
Review comment:
We don't use optional very much (if at all) in the Iceberg API. It is
usually better to return something that behaves correctly and doesn't require
the caller to handle a new case, like returning `ImmutableList.of()` for a list
instead of `Optional.empty()`. When the caller should handle a case, we usually
return `null` and document it to avoid instantiating extra objects. Both of
these options don't require making type changes to the API.
In this case, I think the first option is a better implementation. This
should return an `UnknownTransform` implementation. If a transform is not
recognized, Iceberg can still read data, but cannot write data for that
partition spec because it doesn't know how to derive a partition value using
the transform. Returning `Optional.empty` here and ignoring unknown transforms
in `PartitionSpec` would not signal that the partition spec can't be used to
write. An older writer may try to add files and compact them with a manifest
from a newer writer, which would cause problems.
An `UnknownTransform` implementation should throw an exception in `apply`.
It should also return `Expressions.alwaysTrue` for `project` (all partition
values may contain matching data) and `null` for `projectStrict` (no guarantees
about matching a predicate).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]