rdblue commented on a change in pull request #600: Implement in and notIn in
multiple visitors
URL: https://github.com/apache/incubator-iceberg/pull/600#discussion_r355149471
##########
File path:
api/src/main/java/org/apache/iceberg/expressions/StrictMetricsEvaluator.java
##########
@@ -308,11 +310,67 @@ public Boolean or(Boolean leftResult, Boolean rightResult) {
     @Override
     public <T> Boolean in(BoundReference<T> ref, Set<T> literalSet) {
+      Integer id = ref.fieldId();
+      Types.NestedField field = struct.field(id);
+      Preconditions.checkNotNull(field, "Cannot filter by nested column: %s", schema.findField(id));
+
+      if (canContainNulls(id)) {
+        return ROWS_MIGHT_NOT_MATCH;
+      }
+
+      if (lowerBounds != null && lowerBounds.containsKey(id) &&
+          upperBounds != null && upperBounds.containsKey(id)) {
+        final Comparator<T> comparator = ((BoundSetPredicate<T>) expr).comparator();
+        Set<T> literals = literalSet;
+
+        T lower = Conversions.fromByteBuffer(struct.field(id).type(), lowerBounds.get(id));
+        literals = literals.stream().filter(v -> comparator.compare(lower, v) == 0).collect(Collectors.toSet());
+        if (literals.isEmpty()) {
+          return ROWS_MIGHT_NOT_MATCH;
Review comment:
I think this logic can be simplified to use the set. It should check if
`lower` is in the set, check if `upper` is in the set, and finally check if
`lower` and `upper` are equal. All values must be in the set if those three
conditions are met. Can you update the logic here and add some comments to
explain the implementation?
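
For illustration only, a minimal sketch of the three checks described above. The class `StrictInSketch`, the method `allRowsMatchIn`, and the `ROWS_MUST_MATCH` constant are hypothetical names introduced here, not code from the PR. The sketch assumes the null check has already passed, that both bounds are present and already decoded, and that the set's `equals`-based `contains` agrees with the comparator.

```java
import java.util.Comparator;
import java.util.Set;

// Hypothetical stand-alone sketch; not the actual StrictMetricsEvaluator code.
final class StrictInSketch {
  static final boolean ROWS_MUST_MATCH = true;       // assumed counterpart constant
  static final boolean ROWS_MIGHT_NOT_MATCH = false;

  private StrictInSketch() {
  }

  // Returns ROWS_MUST_MATCH only when every value in [lower, upper] is provably in the set.
  static <T> boolean allRowsMatchIn(T lower, T upper, Set<T> literalSet, Comparator<T> comparator) {
    // If the smallest value in the file is not in the set, at least one row does not match.
    if (!literalSet.contains(lower)) {
      return ROWS_MIGHT_NOT_MATCH;
    }

    // Likewise for the largest value in the file.
    if (!literalSet.contains(upper)) {
      return ROWS_MIGHT_NOT_MATCH;
    }

    // Bounds that differ leave room for values between them that may not be in the set.
    if (comparator.compare(lower, upper) != 0) {
      return ROWS_MIGHT_NOT_MATCH;
    }

    // lower == upper and that single value is in the set, so all rows match.
    return ROWS_MUST_MATCH;
  }
}
```

With the set {5, 7}, bounds of 5 and 5 would give `ROWS_MUST_MATCH`, while bounds of 5 and 7 would give `ROWS_MIGHT_NOT_MATCH` because a value such as 6 could be present in the file.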