rdblue commented on a change in pull request #600: Implement in and notIn in multiple visitors
URL: https://github.com/apache/incubator-iceberg/pull/600#discussion_r355148988
##########
File path:
api/src/main/java/org/apache/iceberg/expressions/InclusiveMetricsEvaluator.java
##########
@@ -269,11 +270,38 @@ public Boolean or(Boolean leftResult, Boolean rightResult) {
@Override
public <T> Boolean in(BoundReference<T> ref, Set<T> literalSet) {
+ Integer id = ref.fieldId();
+
+ if (containsNullsOnly(id)) {
+ return ROWS_CANNOT_MATCH;
+ }
+
+ final Comparator<T> comparator = ((BoundSetPredicate<T>) expr).comparator();
+ Set<T> literals = literalSet;
+
+ if (lowerBounds != null && lowerBounds.containsKey(id)) {
+ T lower = Conversions.fromByteBuffer(ref.type(), lowerBounds.get(id));
+ literals = literals.stream().filter(v -> comparator.compare(lower, v) <= 0).collect(Collectors.toSet());
Review comment:
There's no need to use a set to collect the values, and the set would not
correctly handle mixed UTF8 string types.
I think it would be better to use a list for the result. That would mean
that `literals` can't be reused, but I think it is fine to use a separate
`List<T>` variable for values that are greater than the lower bound.
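
A minimal sketch of the suggestion, not the actual Iceberg change: the filtered values go into a separate `List<T>`, and a small demo shows why a hash-based set is a poor fit for mixed `CharSequence` implementations. The class `LowerBoundFilter`, the helper `atOrAboveLower`, and the comparator `CHAR_SEQ` are hypothetical names used only for illustration; `CHAR_SEQ` stands in for whatever comparator the bound predicate supplies.

```java
import java.nio.CharBuffer;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class LowerBoundFilter {
  // Hypothetical helper (not an Iceberg API): keeps the literals v with lower <= v
  // under the given comparator and collects them into a list instead of a set.
  static <T> List<T> atOrAboveLower(Collection<T> literals, T lower, Comparator<T> comparator) {
    return literals.stream()
        .filter(v -> comparator.compare(lower, v) <= 0)
        .collect(Collectors.toList());
  }

  // Stand-in comparator (an assumption): orders any CharSequence char by char,
  // regardless of the concrete implementation class.
  static final Comparator<CharSequence> CHAR_SEQ = (a, b) -> {
    int len = Math.min(a.length(), b.length());
    for (int i = 0; i < len; i++) {
      int cmp = Character.compare(a.charAt(i), b.charAt(i));
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.length(), b.length());
  };

  public static void main(String[] args) {
    // Mixed CharSequence implementations, as can happen with string literals.
    List<CharSequence> literals = Arrays.asList("apple", CharBuffer.wrap("mango"), "pear");
    List<CharSequence> aboveLower = atOrAboveLower(literals, "banana", CHAR_SEQ);
    System.out.println(aboveLower.size()); // 2: "mango" and "pear"

    // A HashSet uses equals/hashCode, not the comparator, so the same text held in
    // two different CharSequence classes is stored as two distinct elements.
    Set<CharSequence> set = new HashSet<>();
    set.add("mango");
    set.add(CharBuffer.wrap("mango"));
    System.out.println(set.size()); // 2
  }
}
```

The list keeps every surviving literal as-is and leaves all comparisons to the comparator, so the filter never depends on how a particular `CharSequence` class defines `equals` or `hashCode`.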