[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318403061
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/BoundSetPredicate.java
 ##
 @@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.Set;
+import org.apache.iceberg.util.CharSequenceWrapper;
+
+public class BoundSetPredicate extends Predicate> {
+  private final LiteralSet literalSet;
+
+  BoundSetPredicate(Operation op, BoundReference ref, Set> lits) 
{
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a set of literals", op);
+this.literalSet = new LiteralSet<>(lits);
+  }
+
+  BoundSetPredicate(Operation op, BoundReference ref, LiteralSet lits) {
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a literal set", op);
+this.literalSet = lits;
+  }
+
+  @Override
+  public Expression negate() {
+return new BoundSetPredicate<>(op().negate(), ref(), literalSet);
+  }
+
+  public Set literalSet() {
+return literalSet;
+  }
+
+  @Override
+  String literalString() {
+return literalSet.toString();
+  }
+
+  /**
+   * Represents a set of literal values in an IN or NOT_IN predicate
+   * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+   */
+  private static class LiteralSet implements Set, Serializable {
+private final Set values;
+
+@SuppressWarnings("unchecked")
+LiteralSet(Set> lits) {
+  Preconditions.checkArgument(lits == null || lits.size() > 1,
+  "The input literal set must include more than 1 element.");
+  values = ImmutableSet.builder().addAll(
+  lits.stream().map(
+  lit -> {
+if (lit instanceof Literals.StringLiteral) {
+  return (T) 
CharSequenceWrapper.wrap(((Literals.StringLiteral) lit).value());
+} else {
+  return lit.value();
+}
+  }
+  ).iterator()).build();
+}
+
+@Override
+public String toString() {
+  return Joiner.on(", ").join(values);
+}
+
+@Override
+public boolean contains(Object object) {
 
 Review comment:
   Thanks for the comments. Updated.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402782
 
 

 ##
 File path: api/src/test/java/org/apache/iceberg/expressions/TestEvaluator.java
 ##
 @@ -362,4 +368,197 @@ public void testCharSeqValue() {
 Assert.assertFalse("string(abc) == utf8(abcd) => false",
 evaluator.eval(TestHelpers.Row.of(new Utf8("abcd";
   }
+
+  @Test
+  public void testIn() {
+Assert.assertEquals(3, in("s", 7, 8, 9).literals().size());
+Assert.assertEquals(3, in("s", 7, 8.1, Long.MAX_VALUE).literals().size());
+Assert.assertEquals(2, in("s", "abc", "abd", "abc").literals().size());
+Assert.assertEquals(0, in("s").literals().size());
+Assert.assertEquals(1, in("s", 5).literals().size());
+Assert.assertEquals(1, in("s", 5, 5).literals().size());
+Assert.assertEquals(1, in("s", Arrays.asList(5, 5)).literals().size());
+Assert.assertEquals(0, in("s", Collections.emptyList()).literals().size());
+
+Evaluator evaluator = new Evaluator(STRUCT, in("x", 7, 8, Long.MAX_VALUE));
+Assert.assertTrue("7 in [7, 8] => true", 
evaluator.eval(TestHelpers.Row.of(7, 8, null)));
+Assert.assertFalse("9 in [7, 8]  => false", 
evaluator.eval(TestHelpers.Row.of(9, 8, null)));
+
+Evaluator longEvaluator = new Evaluator(STRUCT,
+in("x", Long.MAX_VALUE, Integer.MAX_VALUE, Long.MIN_VALUE));
+Assert.assertTrue("Integer.MAX_VALUE in [Integer.MAX_VALUE] => true",
+longEvaluator.eval(TestHelpers.Row.of(Integer.MAX_VALUE, 7.0, null)));
+Assert.assertFalse("6 in [Integer.MAX_VALUE]  => false",
+longEvaluator.eval(TestHelpers.Row.of(6, 6.8, null)));
+
+Evaluator integerEvaluator = new Evaluator(STRUCT, in("y", 7, 8, 9.1));
+Assert.assertTrue("7.0 in [7, 8, 9.1] => true", 
integerEvaluator.eval(TestHelpers.Row.of(7, 7.0, null)));
+Assert.assertTrue("9.1 in [7, 8, 9.1] => true", 
integerEvaluator.eval(TestHelpers.Row.of(7, 9.1, null)));
+Assert.assertFalse("6.8 in [7, 8, 9]  => false", 
integerEvaluator.eval(TestHelpers.Row.of(6, 6.8, null)));
+
+Evaluator structEvaluator = new Evaluator(STRUCT, in("s1.s2.s3.s4.i", 7, 
8, 9));
+Assert.assertTrue("7 in [7, 8, 9] => true",
+structEvaluator.eval(TestHelpers.Row.of(7, 8, null,
+TestHelpers.Row.of(
+TestHelpers.Row.of(
+TestHelpers.Row.of(
+TestHelpers.Row.of(7)));
+Assert.assertFalse("6 in [7, 8, 9]  => false",
+structEvaluator.eval(TestHelpers.Row.of(6, 8, null,
+TestHelpers.Row.of(
+TestHelpers.Row.of(
+TestHelpers.Row.of(
+TestHelpers.Row.of(6)));
+
+StructType charSeqStruct = StructType.of(required(34, "s", 
Types.StringType.get()));
+Evaluator charSeqEvaluator = new Evaluator(charSeqStruct, in("s", "abc", 
"abd", "abc"));
+Assert.assertTrue("utf8(abc) in [string(abc), string(abd)] => true",
+charSeqEvaluator.eval(TestHelpers.Row.of(new Utf8("abc";
+Assert.assertFalse("utf8(abcd) in [string(abc), string(abd)] => false",
+charSeqEvaluator.eval(TestHelpers.Row.of(new Utf8("abcd";
+  }
+
+  @Test
+  public void testInExceptions() {
+TestHelpers.assertThrows(
+"Throw exception if value is null",
+NullPointerException.class,
+"Cannot create expression literal from null",
+() -> in("x", (Literal) null));
+
+TestHelpers.assertThrows(
+"Throw exception if value is null",
+NullPointerException.class,
+"Values cannot be null for IN predicate",
+() -> in("x", (Collection) null));
+
+TestHelpers.assertThrows(
+"Throw exception if calling literals() for EQ predicate",
+IllegalArgumentException.class,
+"EQ predicate cannot return a list of literals",
+() -> equal("x", 5).literals());
 
 Review comment:
    Updated accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402818
 
 

 ##
 File path: api/src/test/java/org/apache/iceberg/expressions/TestEvaluator.java
 ##
 @@ -362,4 +368,197 @@ public void testCharSeqValue() {
 Assert.assertFalse("string(abc) == utf8(abcd) => false",
 evaluator.eval(TestHelpers.Row.of(new Utf8("abcd";
   }
+
+  @Test
+  public void testIn() {
+Assert.assertEquals(3, in("s", 7, 8, 9).literals().size());
+Assert.assertEquals(3, in("s", 7, 8.1, Long.MAX_VALUE).literals().size());
+Assert.assertEquals(2, in("s", "abc", "abd", "abc").literals().size());
+Assert.assertEquals(0, in("s").literals().size());
+Assert.assertEquals(1, in("s", 5).literals().size());
+Assert.assertEquals(1, in("s", 5, 5).literals().size());
+Assert.assertEquals(1, in("s", Arrays.asList(5, 5)).literals().size());
+Assert.assertEquals(0, in("s", Collections.emptyList()).literals().size());
+
+Evaluator evaluator = new Evaluator(STRUCT, in("x", 7, 8, Long.MAX_VALUE));
+Assert.assertTrue("7 in [7, 8] => true", 
evaluator.eval(TestHelpers.Row.of(7, 8, null)));
+Assert.assertFalse("9 in [7, 8]  => false", 
evaluator.eval(TestHelpers.Row.of(9, 8, null)));
+
+Evaluator longEvaluator = new Evaluator(STRUCT,
+in("x", Long.MAX_VALUE, Integer.MAX_VALUE, Long.MIN_VALUE));
+Assert.assertTrue("Integer.MAX_VALUE in [Integer.MAX_VALUE] => true",
+longEvaluator.eval(TestHelpers.Row.of(Integer.MAX_VALUE, 7.0, null)));
+Assert.assertFalse("6 in [Integer.MAX_VALUE]  => false",
+longEvaluator.eval(TestHelpers.Row.of(6, 6.8, null)));
+
+Evaluator integerEvaluator = new Evaluator(STRUCT, in("y", 7, 8, 9.1));
+Assert.assertTrue("7.0 in [7, 8, 9.1] => true", 
integerEvaluator.eval(TestHelpers.Row.of(7, 7.0, null)));
+Assert.assertTrue("9.1 in [7, 8, 9.1] => true", 
integerEvaluator.eval(TestHelpers.Row.of(7, 9.1, null)));
+Assert.assertFalse("6.8 in [7, 8, 9]  => false", 
integerEvaluator.eval(TestHelpers.Row.of(6, 6.8, null)));
 
 Review comment:
   Fixed.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402850
 
 

 ##
 File path: 
parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java
 ##
 @@ -19,19 +19,14 @@
 
 package org.apache.iceberg.parquet;
 
-import com.google.common.base.Preconditions;
 import com.google.common.collect.Maps;
 import java.util.Map;
+import java.util.Set;
 import java.util.function.Function;
 import org.apache.iceberg.Schema;
-import org.apache.iceberg.expressions.Binder;
-import org.apache.iceberg.expressions.BoundReference;
-import org.apache.iceberg.expressions.Expression;
-import org.apache.iceberg.expressions.ExpressionVisitors;
+import org.apache.iceberg.expressions.*;
 
 Review comment:
   Done.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402697
 
 

 ##
 File path: api/src/test/java/org/apache/iceberg/expressions/TestEvaluator.java
 ##
 @@ -362,4 +368,197 @@ public void testCharSeqValue() {
 Assert.assertFalse("string(abc) == utf8(abcd) => false",
 evaluator.eval(TestHelpers.Row.of(new Utf8("abcd";
   }
+
+  @Test
+  public void testIn() {
+Assert.assertEquals(3, in("s", 7, 8, 9).literals().size());
+Assert.assertEquals(3, in("s", 7, 8.1, Long.MAX_VALUE).literals().size());
+Assert.assertEquals(2, in("s", "abc", "abd", "abc").literals().size());
+Assert.assertEquals(0, in("s").literals().size());
+Assert.assertEquals(1, in("s", 5).literals().size());
+Assert.assertEquals(1, in("s", 5, 5).literals().size());
+Assert.assertEquals(1, in("s", Arrays.asList(5, 5)).literals().size());
+Assert.assertEquals(0, in("s", Collections.emptyList()).literals().size());
+
+Evaluator evaluator = new Evaluator(STRUCT, in("x", 7, 8, Long.MAX_VALUE));
+Assert.assertTrue("7 in [7, 8] => true", 
evaluator.eval(TestHelpers.Row.of(7, 8, null)));
+Assert.assertFalse("9 in [7, 8]  => false", 
evaluator.eval(TestHelpers.Row.of(9, 8, null)));
+
+Evaluator longEvaluator = new Evaluator(STRUCT,
 
 Review comment:
   Thanks and fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402737
 
 

 ##
 File path: api/src/test/java/org/apache/iceberg/expressions/TestEvaluator.java
 ##
 @@ -362,4 +368,197 @@ public void testCharSeqValue() {
 Assert.assertFalse("string(abc) == utf8(abcd) => false",
 evaluator.eval(TestHelpers.Row.of(new Utf8("abcd";
   }
+
+  @Test
+  public void testIn() {
+Assert.assertEquals(3, in("s", 7, 8, 9).literals().size());
+Assert.assertEquals(3, in("s", 7, 8.1, Long.MAX_VALUE).literals().size());
+Assert.assertEquals(2, in("s", "abc", "abd", "abc").literals().size());
+Assert.assertEquals(0, in("s").literals().size());
+Assert.assertEquals(1, in("s", 5).literals().size());
+Assert.assertEquals(1, in("s", 5, 5).literals().size());
+Assert.assertEquals(1, in("s", Arrays.asList(5, 5)).literals().size());
+Assert.assertEquals(0, in("s", Collections.emptyList()).literals().size());
+
+Evaluator evaluator = new Evaluator(STRUCT, in("x", 7, 8, Long.MAX_VALUE));
+Assert.assertTrue("7 in [7, 8] => true", 
evaluator.eval(TestHelpers.Row.of(7, 8, null)));
+Assert.assertFalse("9 in [7, 8]  => false", 
evaluator.eval(TestHelpers.Row.of(9, 8, null)));
+
+Evaluator longEvaluator = new Evaluator(STRUCT,
+in("x", Long.MAX_VALUE, Integer.MAX_VALUE, Long.MIN_VALUE));
+Assert.assertTrue("Integer.MAX_VALUE in [Integer.MAX_VALUE] => true",
+longEvaluator.eval(TestHelpers.Row.of(Integer.MAX_VALUE, 7.0, null)));
+Assert.assertFalse("6 in [Integer.MAX_VALUE]  => false",
+longEvaluator.eval(TestHelpers.Row.of(6, 6.8, null)));
+
+Evaluator integerEvaluator = new Evaluator(STRUCT, in("y", 7, 8, 9.1));
+Assert.assertTrue("7.0 in [7, 8, 9.1] => true", 
integerEvaluator.eval(TestHelpers.Row.of(7, 7.0, null)));
 
 Review comment:
   Done.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402641
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +149,40 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
+  }
+
+  @SuppressWarnings("unchecked")
+  private Expression bindInOperation(Types.NestedField field, Schema schema) {
+final Set> lits = literals().stream().map(
+l -> {
+  Literal lit = l.to(field.type());
+  if (lit == null) {
+throw new ValidationException(String.format(
+"Invalid value for comparison inclusive type %s: %s (%s)",
+field.type(), l.value(), l.value().getClass().getName()));
+  }
+  return lit;
+})
+.filter(l -> l != Literals.aboveMax() && l != Literals.belowMin())
+.collect(Collectors.toSet());
+
+if (lits.isEmpty()) {
+  return Expressions.alwaysFalse();
+} else if (lits.size() == 1) {
+  return new BoundPredicate<>(Operation.EQ, new 
BoundReference<>(field.fieldId(),
+  schema.accessorForField(field.fieldId())), lits.iterator().next());
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402469
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/BoundSetPredicate.java
 ##
 @@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.Set;
+import org.apache.iceberg.util.CharSequenceWrapper;
+
+public class BoundSetPredicate extends Predicate> {
+  private final LiteralSet literalSet;
+
+  BoundSetPredicate(Operation op, BoundReference ref, Set> lits) 
{
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a set of literals", op);
+this.literalSet = new LiteralSet<>(lits);
+  }
+
+  BoundSetPredicate(Operation op, BoundReference ref, LiteralSet lits) {
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a literal set", op);
+this.literalSet = lits;
+  }
+
+  @Override
+  public Expression negate() {
+return new BoundSetPredicate<>(op().negate(), ref(), literalSet);
+  }
+
+  public Set literalSet() {
+return literalSet;
+  }
+
+  @Override
+  String literalString() {
+return literalSet.toString();
+  }
+
+  /**
+   * Represents a set of literal values in an IN or NOT_IN predicate
+   * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+   */
+  private static class LiteralSet implements Set, Serializable {
+private final Set values;
+
+@SuppressWarnings("unchecked")
+LiteralSet(Set> lits) {
+  Preconditions.checkArgument(lits == null || lits.size() > 1,
+  "The input literal set must include more than 1 element.");
+  values = ImmutableSet.builder().addAll(
+  lits.stream().map(
+  lit -> {
+if (lit instanceof Literals.StringLiteral) {
+  return (T) 
CharSequenceWrapper.wrap(((Literals.StringLiteral) lit).value());
+} else {
+  return lit.value();
+}
+  }
+  ).iterator()).build();
+}
+
+@Override
+public String toString() {
+  return Joiner.on(", ").join(values);
+}
+
+@Override
+public boolean contains(Object object) {
+  if (object instanceof CharSequence) {
+return values.contains(CharSequenceWrapper.wrap((CharSequence) 
object));
+  }
+  return values.contains(object);
+}
+
+@Override
+public int size() {
+  return values.size();
+}
+
+@Override
+public boolean isEmpty() {
+  return values.isEmpty();
+}
+
+@Override
+@SuppressWarnings("unchecked")
+public Iterator iterator() {
+  return values.stream().map(
+  val -> {
+if (val instanceof CharSequenceWrapper) {
+  return (T) ((CharSequenceWrapper) val).get();
+} else {
+  return val;
+}
+  }).iterator();
+}
+
+@Override
+public boolean containsAll(Collection c) {
+  throw new UnsupportedOperationException(
+  "LiteralSet currently only supports checking if a single item is 
contained in it.");
+}
+
+@Override
+public Object[] toArray() {
+  throw new UnsupportedOperationException(
+  "Please use iterator() to visit the elements in the set.");
+}
+
+@Override
+public  X[] toArray(X[] a) {
+  throw new UnsupportedOperationException(
+  "Please use iterator() to visit the elements in the set.");
+}
+
+@Override
+public boolean add(T t) {
+  throw new UnsupportedOperationException(
+  "The set is immutable and cannot add an element.");
+}
+
+@Override
+public boolean remove(Object o) {
+  throw new UnsupportedOperationException(
+  "The set is immutable 

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402583
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +149,40 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
 
 Review comment:
   Fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402556
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -39,37 +37,35 @@ public R ref() {
 return ref;
   }
 
-  public Literal literal() {
-return literal;
-  }
+  abstract String literalString();
 
   @Override
   public String toString() {
-switch (op) {
+switch (op()) {
 
 Review comment:
   Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r318402380
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/BoundSetPredicate.java
 ##
 @@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.Set;
+import org.apache.iceberg.util.CharSequenceWrapper;
+
+public class BoundSetPredicate extends Predicate> {
+  private final LiteralSet literalSet;
+
+  BoundSetPredicate(Operation op, BoundReference ref, Set> lits) 
{
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a set of literals", op);
+this.literalSet = new LiteralSet<>(lits);
+  }
+
+  BoundSetPredicate(Operation op, BoundReference ref, LiteralSet lits) {
+super(op, ref);
+Preconditions.checkArgument(op == Operation.IN || op == Operation.NOT_IN,
+"%s predicate does not support a literal set", op);
+this.literalSet = lits;
+  }
+
+  @Override
+  public Expression negate() {
+return new BoundSetPredicate<>(op().negate(), ref(), literalSet);
+  }
+
+  public Set literalSet() {
+return literalSet;
+  }
+
+  @Override
+  String literalString() {
+return literalSet.toString();
+  }
+
+  /**
+   * Represents a set of literal values in an IN or NOT_IN predicate
+   * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+   */
+  private static class LiteralSet implements Set, Serializable {
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-27 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r317910185
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java
 ##
 @@ -51,6 +53,10 @@ public R or(R leftResult, R rightResult) {
   return null;
 }
 
+public  R predicate(BoundSetPredicate pred) {
+  return null;
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315009996
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java
 ##
 @@ -89,12 +93,12 @@ public R or(R leftResult, R rightResult) {
   return null;
 }
 
-public  R in(BoundReference ref, Literal lit) {
-  return null;
+public  R in(BoundReference ref, LiteralSet literalSet) {
 
 Review comment:
   As only `BoundSetPredicate` can access `LiteralSet`, I will make 
`LiteralSet` as `BoundSetPredicate`'s private inner class.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315009526
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +180,65 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
+  }
+
+  @SuppressWarnings("unchecked")
+  private Expression bindInOperation(Types.NestedField field, Schema schema) {
+final Set> lits = literals().stream().map(
+l -> {
+  Literal lit = l.to(field.type());
+  if (lit == null) {
+throw new ValidationException(String.format(
+"Invalid value for comparison inclusive type %s: %s (%s)",
+field.type(), l.value(), l.value().getClass().getName()));
+  }
+  return lit;
+})
+.filter(l -> l != Literals.aboveMax() && l != Literals.belowMin())
+.collect(Collectors.toSet());
+
+if (lits.isEmpty()) {
+  return Expressions.alwaysFalse();
+} else if (lits.size() == 1) {
+  return new BoundPredicate<>(Operation.EQ, new 
BoundReference<>(field.fieldId(),
+  schema.accessorForField(field.fieldId())), lits.iterator().next());
+} else {
+  return new BoundSetPredicate<>(Operation.IN, new 
BoundReference<>(field.fieldId(),
+  schema.accessorForField(field.fieldId())), lits);
+}
+  }
+
+  @Override
+  public String toString() {
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315009493
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/util/CharSequenceWrapper.java
 ##
 @@ -49,21 +50,22 @@ public boolean equals(Object other) {
 if (this == other) {
   return true;
 }
-if (other == null || getClass() != other.getClass()) {
+if (other == null) {
   return false;
 }
 
-CharSequenceWrapper that = (CharSequenceWrapper) other;
-return Comparators.charSequences().compare(wrapped, that.wrapped) == 0;
+if (other instanceof CharSequence) {
+  return Comparators.charSequences().compare(wrapped, (CharSequence) 
other) == 0;
+} else if (other instanceof CharSequenceWrapper) {
+  CharSequenceWrapper that = (CharSequenceWrapper) other;
+  return Comparators.charSequences().compare(wrapped, that.wrapped) == 0;
+} else {
+  return false;
+}
   }
 
   @Override
   public int hashCode() {
-int result = 177;
-for (int i = 0; i < wrapped.length(); i += 1) {
-  char ch = wrapped.charAt(i);
-  result = 31 * result + (int) ch;
-}
-return result;
+return wrapped.hashCode();
 
 Review comment:
   Thanks for the comments. Done.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315007006
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.Set;
+import org.apache.iceberg.util.CharSequenceWrapper;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+class LiteralSet implements Set, Serializable {
+  private final Set values;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+values = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) CharSequenceWrapper.wrap((CharSequence) 
lit.value());
 
 Review comment:
   Update the code, but due to `(T)`, still need suppressing unchecked warning.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315006857
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -109,16 +111,46 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.STARTS_WITH, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
+return predicate(Operation.IN, name,
+Stream.concat(Stream.of(value), Stream.of(values))
+.map(Literals::from).collect(Collectors.toSet()));
+  }
+
+  public static  UnboundPredicate notIn(String name, T value, T... 
values) {
 
 Review comment:
   Done to add another factory method accepting `Collection`.
   The conversion is deferred to `binding`, here can just return 
`UnboundPredicate`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315006857
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -109,16 +111,46 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.STARTS_WITH, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
+return predicate(Operation.IN, name,
+Stream.concat(Stream.of(value), Stream.of(values))
+.map(Literals::from).collect(Collectors.toSet()));
+  }
+
+  public static  UnboundPredicate notIn(String name, T value, T... 
values) {
 
 Review comment:
   Done.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315006537
 
 

 ##
 File path: 
parquet/src/main/java/org/apache/iceberg/parquet/ParquetDictionaryRowGroupFilter.java
 ##
 @@ -285,12 +284,12 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
   return ROWS_MIGHT_MATCH;
 
 Review comment:
   Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315006529
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +154,38 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
+  }
+
+  @SuppressWarnings("unchecked")
+  private Expression bindInOperation(Types.NestedField field, Schema schema) {
+final Set> lits = literalSet().stream().map(
+val -> {
+  if (val instanceof LiteralSet.CharSeqWrapper) {
+val = (T) ((LiteralSet.CharSeqWrapper) val).unWrap();
+  }
+  Literal lit = Literals.from(val).to(field.type());
+  if (lit == null) {
+throw new ValidationException(String.format(
+"Invalid value for comparison inclusive type %s: %s (%s)",
+field.type(), val, val.getClass().getName()));
+  }
+  return lit;
+})
+.filter(l -> l != Literals.aboveMax() && l != Literals.belowMin())
+.collect(Collectors.toSet());
 
 Review comment:
   Thanks for the suggestion. Updated the code accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-18 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r315006503
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Literals.java
 ##
 @@ -109,6 +110,26 @@ public T value() {
 public String toString() {
   return String.valueOf(value);
 }
+
+@Override
+@SuppressWarnings("unchecked")
+public boolean equals(Object other) {
 
 Review comment:
   Currently, it has been used in two places.
   - In the factory methods in Expressions, when a collection is passed into 
`in` or `not_in`, they will be converted to Literal and then to a Set.
   - In binding, after all literals in Unbound Predicate is converted to the 
type, it is used for deduplication.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-16 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314591877
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Literals.java
 ##
 @@ -109,6 +110,26 @@ public T value() {
 public String toString() {
   return String.valueOf(value);
 }
+
+@Override
+@SuppressWarnings("unchecked")
+public boolean equals(Object other) {
 
 Review comment:
   Yes, it is used for deduplication. Additionally, it might be better to 
override `equals` to be consistent with the behavior of the comparison.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314586632
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -109,16 +111,46 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.STARTS_WITH, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
+return predicate(Operation.IN, name,
+Stream.concat(Stream.of(value), Stream.of(values))
+.map(Literals::from).collect(Collectors.toSet()));
+  }
+
+  public static  UnboundPredicate notIn(String name, T value, T... 
values) {
 
 Review comment:
   @rdblue 
   Thanks for the comment, I will make it to be `T...` here.
   
   I still think it may be better to throw exception instead of return 
`alwaysTrue` because that will force the method to return `Expression`. When 
users use it, they have to explicitly cast it to the concrete class, which 
makes it not-easy to use. 
   
   Please let me know your thoughts. thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314589849
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +154,38 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
+  }
+
+  @SuppressWarnings("unchecked")
+  private Expression bindInOperation(Types.NestedField field, Schema schema) {
+final Set> lits = literalSet().stream().map(
+val -> {
+  if (val instanceof LiteralSet.CharSeqWrapper) {
+val = (T) ((LiteralSet.CharSeqWrapper) val).unWrap();
+  }
+  Literal lit = Literals.from(val).to(field.type());
 
 Review comment:
   Sure, I will change it.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314589687
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -19,15 +19,49 @@
 
 package org.apache.iceberg.expressions;
 
+import com.google.common.base.Preconditions;
+import java.util.Set;
+
 public abstract class Predicate implements Expression {
   private final Operation op;
   private final R ref;
   private final Literal literal;
+  private final LiteralSet literalSet;
 
 Review comment:
   Thanks for the comments. Updated the classes accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314588864
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +154,38 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
   }
 }
 return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+schema.accessorForField(field.fieldId())), lit);
+  }
+
+  @SuppressWarnings("unchecked")
+  private Expression bindInOperation(Types.NestedField field, Schema schema) {
+final Set> lits = literalSet().stream().map(
+val -> {
+  if (val instanceof LiteralSet.CharSeqWrapper) {
+val = (T) ((LiteralSet.CharSeqWrapper) val).unWrap();
+  }
+  Literal lit = Literals.from(val).to(field.type());
+  if (lit == null) {
+throw new ValidationException(String.format(
+"Invalid value for comparison inclusive type %s: %s (%s)",
+field.type(), val, val.getClass().getName()));
+  }
+  return lit;
+})
+.filter(l -> l != Literals.aboveMax() && l != Literals.belowMin())
+.collect(Collectors.toSet());
 
 Review comment:
   @rdblue The main reason to make it to be a `Set` is for deduplication.
   For example,
   ```
   StructType struct = StructType.of(required(15, "d", Types.DecimalType.of(9, 
2)));
   UnboundPredicate unbound = Expressions.in("d", 12.40, 12.401, 
12.402);
   Expression expr = unbound.bind(struct);
   // expr should be `BoundPredicate` with `EQ` op.
   ```
   So here, the size of literalSet might shrink due to the type conversion.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314586632
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -109,16 +111,46 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.STARTS_WITH, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
+return predicate(Operation.IN, name,
+Stream.concat(Stream.of(value), Stream.of(values))
+.map(Literals::from).collect(Collectors.toSet()));
+  }
+
+  public static  UnboundPredicate notIn(String name, T value, T... 
values) {
 
 Review comment:
   @rdblue 
   Thanks for the comment, I can make it to be `T...` here.
   
   I still think it may be better to throw exception instead of return 
`alwaysTrue` because that will force the method to return `Expression`. When 
users use it, they have to explicitly cast it to the concrete class, which 
makes it not-easy to use. 
   
   Please let me know your thoughts. thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585849
 
 

 ##
 File path: 
parquet/src/main/java/org/apache/iceberg/parquet/ParquetDictionaryRowGroupFilter.java
 ##
 @@ -285,12 +284,12 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
   return ROWS_MIGHT_MATCH;
 
 Review comment:
   Yep, I agree. 
   I think It is probably better to be implemented in the next PR, which I am 
going to work on it after this one.
   Please let me know your thoughts. thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585643
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ManifestEvaluator.java
 ##
 @@ -245,12 +245,12 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
   return ROWS_MIGHT_MATCH;
 
 Review comment:
   @rdblue 
   It is probably better to be implemented in the next PR, which I am going to 
work on it after this one.
   Please let me know your thoughts. thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585257
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
+if (!it.hasNext()) {
+  return "{}";
+}
+
+StringBuilder sb = new StringBuilder();
+sb.append('{');
+while (true) {
+  sb.append(it.next());
+  if (!it.hasNext()) {
+return sb.append('}').toString();
+  }
+  sb.append(',').append(' ');
+}
+  }
+
+  @Override
+  @SuppressWarnings("unchecked")
+  public boolean equals(Object other) {
+if (this == other) {
+  return true;
+}
+if (other == null || getClass() != other.getClass()) {
+  return false;
+}
+LiteralSet that = (LiteralSet) other;
+return literals.equals(that.literals);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hashCode(literals);
+  }
+
+  @Override
+  public boolean contains(Object object) {
+return literals.contains(object);
+  }
+
+  @Override
+  public int size() {
+return literals.size();
+  }
+
+  @Override
+  public boolean isEmpty() {
+return literals.isEmpty();
+  }
+
+  @Override
+  public Iterator iterator() {
+return literals.iterator();
+  }
+
+  @Override
+  public Object[] toArray() {
+return literals.toArray();
+  }
+
+  @Override
+  public  X[] toArray(X[] a) {
+return literals.toArray(a);
+  }
+
+  @Override
+  public boolean containsAll(Collection c) {
+return literals.containsAll(c);
+  }
+
+  @Override
+  public boolean add(T t) {
+throw new UnsupportedOperationException();
 
 Review comment:
   Thanks for the suggestions. Added the descriptions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314584436
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
+if (!it.hasNext()) {
+  return "{}";
+}
+
+StringBuilder sb = new StringBuilder();
+sb.append('{');
+while (true) {
+  sb.append(it.next());
+  if (!it.hasNext()) {
+return sb.append('}').toString();
+  }
+  sb.append(',').append(' ');
+}
+  }
+
+  @Override
+  @SuppressWarnings("unchecked")
+  public boolean equals(Object other) {
+if (this == other) {
+  return true;
+}
+if (other == null || getClass() != other.getClass()) {
+  return false;
+}
+LiteralSet that = (LiteralSet) other;
+return literals.equals(that.literals);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hashCode(literals);
+  }
+
+  @Override
+  public boolean contains(Object object) {
+return literals.contains(object);
+  }
+
+  @Override
+  public int size() {
+return literals.size();
+  }
+
+  @Override
+  public boolean isEmpty() {
+return literals.isEmpty();
+  }
+
+  @Override
+  public Iterator iterator() {
+return literals.iterator();
+  }
+
+  @Override
+  public Object[] toArray() {
+return literals.toArray();
+  }
+
+  @Override
+  public  X[] toArray(X[] a) {
+return literals.toArray(a);
 
 Review comment:
   Yep, It should. 
   As `toArray` methods are not used, I will make them to throw exceptions. 
   Then, in the iterator, I added the logic to unwrap the `CharSequence`.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314584147
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java
 ##
 @@ -89,12 +93,12 @@ public R or(R leftResult, R rightResult) {
   return null;
 }
 
-public  R in(BoundReference ref, Literal lit) {
-  return null;
+public  R in(BoundReference ref, LiteralSet literalSet) {
 
 Review comment:
   Yep, thanks for the comments. Updated them accordingly. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583939
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
+if (!it.hasNext()) {
+  return "{}";
+}
+
+StringBuilder sb = new StringBuilder();
+sb.append('{');
+while (true) {
+  sb.append(it.next());
+  if (!it.hasNext()) {
+return sb.append('}').toString();
+  }
+  sb.append(',').append(' ');
+}
+  }
+
+  @Override
+  @SuppressWarnings("unchecked")
+  public boolean equals(Object other) {
+if (this == other) {
+  return true;
+}
+if (other == null || getClass() != other.getClass()) {
+  return false;
+}
+LiteralSet that = (LiteralSet) other;
+return literals.equals(that.literals);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hashCode(literals);
+  }
+
+  @Override
+  public boolean contains(Object object) {
+return literals.contains(object);
+  }
+
+  @Override
+  public int size() {
+return literals.size();
+  }
+
+  @Override
+  public boolean isEmpty() {
+return literals.isEmpty();
+  }
+
+  @Override
+  public Iterator iterator() {
+return literals.iterator();
+  }
+
+  @Override
+  public Object[] toArray() {
+return literals.toArray();
+  }
+
+  @Override
+  public  X[] toArray(X[] a) {
+return literals.toArray(a);
+  }
+
+  @Override
+  public boolean containsAll(Collection c) {
+return literals.containsAll(c);
+  }
+
+  @Override
+  public boolean add(T t) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean remove(Object o) {
+throw new UnsupportedOperationException();
+  }
+
+
+  @Override
+  public boolean addAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean retainAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean removeAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public void clear() {
+throw new UnsupportedOperationException();
+  }
+
+  static class CharSeqWrapper implements CharSequence, Serializable {
+private static final Comparator CMP =
+
Comparators.nullsFirst().thenComparing(Comparators.charSequences());
+
+private final CharSequence chars;
+
+CharSeqWrapper(CharSequence charSequence) {
+  Preconditions.checkNotNull(charSequence, "String literal values cannot 
be null");
+  this.chars = charSequence;
+}
+
+@Override
+public int length() {
+  return chars.length();
+}
+
+@Override
+public char charAt(int index) {
+  return chars.charAt(index);
+}
+
+@Override
+public CharSequence subSequence(int start, int end) {
+  return 

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583886
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
+if (!it.hasNext()) {
+  return "{}";
+}
+
+StringBuilder sb = new StringBuilder();
+sb.append('{');
+while (true) {
+  sb.append(it.next());
+  if (!it.hasNext()) {
+return sb.append('}').toString();
+  }
+  sb.append(',').append(' ');
+}
+  }
+
+  @Override
+  @SuppressWarnings("unchecked")
+  public boolean equals(Object other) {
+if (this == other) {
+  return true;
+}
+if (other == null || getClass() != other.getClass()) {
+  return false;
+}
+LiteralSet that = (LiteralSet) other;
+return literals.equals(that.literals);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hashCode(literals);
+  }
+
+  @Override
+  public boolean contains(Object object) {
+return literals.contains(object);
+  }
+
+  @Override
+  public int size() {
+return literals.size();
+  }
+
+  @Override
+  public boolean isEmpty() {
+return literals.isEmpty();
+  }
+
+  @Override
+  public Iterator iterator() {
+return literals.iterator();
+  }
+
+  @Override
+  public Object[] toArray() {
+return literals.toArray();
+  }
+
+  @Override
+  public  X[] toArray(X[] a) {
+return literals.toArray(a);
+  }
+
+  @Override
+  public boolean containsAll(Collection c) {
+return literals.containsAll(c);
+  }
+
+  @Override
+  public boolean add(T t) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean remove(Object o) {
+throw new UnsupportedOperationException();
+  }
+
+
+  @Override
+  public boolean addAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean retainAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public boolean removeAll(Collection c) {
+throw new UnsupportedOperationException();
+  }
+
+  @Override
+  public void clear() {
+throw new UnsupportedOperationException();
+  }
+
+  static class CharSeqWrapper implements CharSequence, Serializable {
 
 Review comment:
   Thanks for the suggestion! 
   Will remove `CharSeqWrapper` and reuse the `CharSequenceWrapper` instead.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583647
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
+if (!it.hasNext()) {
+  return "{}";
+}
+
+StringBuilder sb = new StringBuilder();
+sb.append('{');
+while (true) {
+  sb.append(it.next());
+  if (!it.hasNext()) {
+return sb.append('}').toString();
+  }
+  sb.append(',').append(' ');
+}
+  }
+
+  @Override
+  @SuppressWarnings("unchecked")
+  public boolean equals(Object other) {
 
 Review comment:
   Thanks for the comments.
   That has been used in `TestExpressionSerialization` to check the equality of 
two predicates.
   As that is the only place using it, I am going to move it inside 
`TestExpressionSerialization` and remove `equals` and `hashCode` from the 
`LiteralSet`.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314505113
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java
 ##
 @@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.expressions;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.iceberg.types.Comparators;
+
+/**
+ * Represents a set of literal values in an IN or NOT_IN predicate
+ * @param  The Java type of the value, which can be wrapped by a {@link 
Literal}
+ */
+public class LiteralSet implements Set, Serializable {
+  private final Set literals;
+
+  @SuppressWarnings("unchecked")
+  LiteralSet(Set> lits) {
+Preconditions.checkArgument(lits == null || lits.size() > 1,
+"The input literal set must include more than 1 element.");
+literals = ImmutableSet.builder().addAll(
+lits.stream().map(
+lit -> {
+  if (lit instanceof Literals.StringLiteral) {
+return (T) new CharSeqWrapper((CharSequence) lit.value());
+  } else {
+return lit.value();
+  }
+}
+).iterator()).build();
+  }
+
+  @Override
+  public String toString() {
+Iterator it = literals.iterator();
 
 Review comment:
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313653184
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,18 +134,18 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
+  return literalSet.contains(ref.get(struct));
 }
 
 @Override
-public  Boolean notIn(BoundReference ref, Literal lit) {
-  return !in(ref, lit);
+public  Boolean notIn(BoundReference ref, LiteralSet literalSet) {
+  return !in(ref, literalSet);
 }
 
 @Override
 public  Boolean startsWith(BoundReference ref, Literal lit) {
-  return ((String) ref.get(struct)).startsWith((String) lit.value());
+  return ((String) ref.get(struct)).startsWith(lit.value().toString());
 
 Review comment:
   Here, I changed the construction of `StringLiteral` to be
   ```
   StringLiteral(CharSequence value) {
 super(new CharSeqWrapper(value));
   }
   ```
   
   The other way is to only wrap `CharSequence` within LiteralSet. In that way, 
there is no need to change `startsWith`.
   
   Wrapping inside seems cleaner with less side effect and I will update the PR 
accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313653184
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,18 +134,18 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
+  return literalSet.contains(ref.get(struct));
 }
 
 @Override
-public  Boolean notIn(BoundReference ref, Literal lit) {
-  return !in(ref, lit);
+public  Boolean notIn(BoundReference ref, LiteralSet literalSet) {
+  return !in(ref, literalSet);
 }
 
 @Override
 public  Boolean startsWith(BoundReference ref, Literal lit) {
-  return ((String) ref.get(struct)).startsWith((String) lit.value());
+  return ((String) ref.get(struct)).startsWith(lit.value().toString());
 
 Review comment:
   Here, I changed the construction of `StringLiteral` to be
   ```
   StringLiteral(CharSequence value) {
 super(new CharSeqWrapper(value));
   }
   ```
   
   The other way is to only wrap `CharSequence` within LiteralSet. In that way, 
there is no need to change `startsWith`.
   
   This seems cleaner with less side effect and I will update the PR 
accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313653184
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,18 +134,18 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
+  return literalSet.contains(ref.get(struct));
 }
 
 @Override
-public  Boolean notIn(BoundReference ref, Literal lit) {
-  return !in(ref, lit);
+public  Boolean notIn(BoundReference ref, LiteralSet literalSet) {
+  return !in(ref, literalSet);
 }
 
 @Override
 public  Boolean startsWith(BoundReference ref, Literal lit) {
-  return ((String) ref.get(struct)).startsWith((String) lit.value());
+  return ((String) ref.get(struct)).startsWith(lit.value().toString());
 
 Review comment:
   Here, I changed the construction of `StringLiteral` to be
   ```StringLiteral(CharSequence value) {
 super(new CharSeqWrapper(value));
   }```
   
   The other way is to only wrap `CharSequence` within LiteralSet. In that way, 
there is no need to change `startsWith`.
   
   This seems cleaner with less side effect and I will update the PR 
accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313653184
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,18 +134,18 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
+  return literalSet.contains(ref.get(struct));
 }
 
 @Override
-public  Boolean notIn(BoundReference ref, Literal lit) {
-  return !in(ref, lit);
+public  Boolean notIn(BoundReference ref, LiteralSet literalSet) {
+  return !in(ref, literalSet);
 }
 
 @Override
 public  Boolean startsWith(BoundReference ref, Literal lit) {
-  return ((String) ref.get(struct)).startsWith((String) lit.value());
+  return ((String) ref.get(struct)).startsWith(lit.value().toString());
 
 Review comment:
   Here, I changed the construction of `StringLiteral` to be
   ```StringLiteral(CharSequence value) {
 super(new CharSeqWrapper(value));
   }
   ```
   
   The other way is to only wrap `CharSequence` within LiteralSet. In that way, 
there is no need to change `startsWith`.
   
   This seems cleaner with less side effect and I will update the PR 
accordingly.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313593900
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -19,15 +19,44 @@
 
 package org.apache.iceberg.expressions;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.util.Collection;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
 public abstract class Predicate implements Expression {
   private final Operation op;
   private final R ref;
   private final Literal literal;
+  private final Set> literalSet;
 
 Review comment:
   @rdblue Sorry for the confusion.
   
   In this change, only factory methods in `Expressions` and the `bind` in 
`UnboundPredicate` can change the returned predicate.
   
   What I meant is that the factory method `in` will return an `EQ` predicate 
if only a single element is provided because UnboundPredicate factory method 
for IN operation with a single item should return `EQ` predicate.
   
   As the `bind` returns `Expression`, it is straightforward to separated the 
`BoundPredicate`.
   
   But in the case of the factory method, if it also returns `Expression` or a 
new interface class instead of `UnboundPredicate`, then later when it is used, 
it has to be explicitly cast by checking the `operation`. This seems 
over-complicated compared to the current approach.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313593900
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -19,15 +19,44 @@
 
 package org.apache.iceberg.expressions;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.util.Collection;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
 public abstract class Predicate implements Expression {
   private final Operation op;
   private final R ref;
   private final Literal literal;
+  private final Set> literalSet;
 
 Review comment:
   @rdblue Sorry for the misunderstanding.
   
   In this change, only factory methods in `Expressions` and the `bind` in 
`UnboundPredicate` can change the returned predicate.
   
   What I meant is that the factory method `in` will return an `EQ` predicate 
if only a single element is provided because UnboundPredicate factory method 
for IN operation with a single item should return `EQ` predicate.
   
   As the `bind` returns `Expression`, it is straightforward to separated the 
`BoundPredicate`.
   
   But in the case of the factory method, if it also returns `Expression` or a 
new interface class instead of `UnboundPredicate`, then later when it is used, 
it has to be explicitly cast by checking the `operation`. This seems 
over-complicated compared to the current approach.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313576493
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,18 +134,18 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, LiteralSet literalSet) {
+  return literalSet.contains(ref.get(struct));
 }
 
 @Override
-public  Boolean notIn(BoundReference ref, Literal lit) {
-  return !in(ref, lit);
+public  Boolean notIn(BoundReference ref, LiteralSet literalSet) {
+  return !in(ref, literalSet);
 }
 
 @Override
 public  Boolean startsWith(BoundReference ref, Literal lit) {
-  return ((String) ref.get(struct)).startsWith((String) lit.value());
+  return ((String) ref.get(struct)).startsWith(lit.value().toString());
 
 Review comment:
   The issue is that `lit.value()` in this case will return a `CharSeqWrapper` 
instead of `String`. It cannot be cast to `String`.
   If the underlying is a `String`, `toString()` will still return it without 
extra costs.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313463995
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java
 ##
 @@ -89,11 +91,11 @@ public R or(R leftResult, R rightResult) {
   return null;
 }
 
-public  R in(BoundReference ref, Literal lit) {
+public  R in(BoundReference ref, Set> lits) {
   return null;
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-13 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313259192
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,13 +135,13 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, Set> lits) {
+  return lits.contains(Literals.from(ref.get(struct)));
 
 Review comment:
   Oops, I missed your comment. 
   Yep, you are right. I found this issue and came up with a similar solution.
   Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313221838
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -105,6 +105,16 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.NOT_EQ, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313221766
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +132,26 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
+case IN:
+case NOT_IN:
+  break;
   }
 }
-return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+
+switch (op()) {
+  case IN:
+  case NOT_IN:
+@SuppressWarnings("unchecked")
+Set> lits = literalSet()
+.stream()
+.map(l -> (Literal) l.to(field.type()))
+.filter(Objects::nonNull)
+.collect(Collectors.toSet());
+return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313221698
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -19,15 +19,44 @@
 
 package org.apache.iceberg.expressions;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.util.Collection;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
 public abstract class Predicate implements Expression {
   private final Operation op;
   private final R ref;
   private final Literal literal;
+  private final Set> literalSet;
 
   Predicate(Operation op, R ref, Literal lit) {
 this.op = op;
 this.ref = ref;
 this.literal = lit;
+if (lit == null || lit == Literals.aboveMax() || lit == 
Literals.belowMin() ||
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313221613
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -19,15 +19,44 @@
 
 package org.apache.iceberg.expressions;
 
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableSet;
+import java.util.Collection;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
 public abstract class Predicate implements Expression {
   private final Operation op;
   private final R ref;
   private final Literal literal;
+  private final Set> literalSet;
 
 Review comment:
   Thanks for the review!
   Yeah, I agree that it is better to have two classes for bound predicate as 
it will only be generated as expression during binding. 
   For `UnboundPredicate` and `Predicate`, I still think it might be good to be 
a single class. For example, `UnboundPredicate` for `IN` predicate with a 
single item should return an `EQ` predicate. The separation will complicate it.
   Please let me know your comments. Thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313220700
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,13 +135,13 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, Set> lits) {
+  return lits.contains(Literals.from(ref.get(struct)));
 
 Review comment:
   @rdblue  There is an issue with `CharSequence`. For example, `String("abc")` 
and `UTF8("abc")` are not equal but the comparator returns 0. This causes the 
`IN` eval return `false`.
   To fix it, I created a wrapper class and overrides the `equals` to use the 
provided comparator. It worked in our cases although it is not perfect as 
`Set contains B` but `Set may not contain WrapperA`.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313220319
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java
 ##
 @@ -125,13 +132,26 @@ public Expression bind(Types.StructType struct, boolean 
caseSensitive) {
 case LT_EQ:
 case EQ:
   return Expressions.alwaysFalse();
-//case IN:
-//  break;
-//case NOT_IN:
-//  break;
+case IN:
+case NOT_IN:
+  break;
   }
 }
-return new BoundPredicate<>(op(), new BoundReference<>(field.fieldId(),
-schema.accessorForField(field.fieldId())), lit);
+
+switch (op()) {
+  case IN:
+  case NOT_IN:
+@SuppressWarnings("unchecked")
+Set> lits = literalSet()
+.stream()
+.map(l -> (Literal) l.to(field.type()))
+.filter(Objects::nonNull)
 
 Review comment:
   Thanks for the comments. Fixed it.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-12 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r313220240
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
 ##
 @@ -105,6 +105,16 @@ public static Expression not(Expression child) {
 return new UnboundPredicate<>(Expression.Operation.NOT_EQ, ref(name), 
value);
   }
 
+  public static  UnboundPredicate in(String name, T value, T... values) {
+Preconditions.checkNotNull(value, "in predicate must include at least one 
value");
 
 Review comment:
   Got it and will add the check for it.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-10 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r312723407
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java
 ##
 @@ -62,10 +95,10 @@ public String toString() {
 return String.valueOf(ref()) + " == " + literal();
   case NOT_EQ:
 return String.valueOf(ref()) + " != " + literal();
-//  case IN:
-//break;
-//  case NOT_IN:
-//break;
+  case IN:
+return String.valueOf(ref()) + " in " + literalSet();
 
 Review comment:
   Good point. Will update it accordingly. Thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-10 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r312723380
 
 

 ##
 File path: 
api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java
 ##
 @@ -89,11 +91,11 @@ public R or(R leftResult, R rightResult) {
   return null;
 }
 
-public  R in(BoundReference ref, Literal lit) {
+public  R in(BoundReference ref, Set> lits) {
   return null;
 
 Review comment:
   Yeah, I agree. 
   Although I went through all concrete classes and fixed them (as I changed 
the signature), it is better to throw exception instead of returning `null`.
   It seems better to throw `UnsupportedOperationException` for default 
implementations for all operations. I can make those changes if there is no 
specific reason to `return null`.
   Let me know your comments. Thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-10 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r312720963
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,13 +135,13 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, Set> lits) {
+  return lits.contains(Literals.from(ref.get(struct)));
 
 Review comment:
   Thanks for the comment! 
   Taking a quick look at the literal's `comparator`s and seems all of them are 
`consistent with equals` of the underlying values. 
   So it seems to be safe to just put the values into the `Set` instead of 
putting literals.
   I will make a change to avoid the unnecessary literal cast.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-10 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates
URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r312720963
 
 

 ##
 File path: api/src/main/java/org/apache/iceberg/expressions/Evaluator.java
 ##
 @@ -134,13 +135,13 @@ public Boolean or(Boolean leftResult, Boolean 
rightResult) {
 }
 
 @Override
-public  Boolean in(BoundReference ref, Literal lit) {
-  throw new UnsupportedOperationException("In is not supported yet");
+public  Boolean in(BoundReference ref, Set> lits) {
+  return lits.contains(Literals.from(ref.get(struct)));
 
 Review comment:
   Thanks for the comment! 
   Taking a quick look at the literal's `comparator`s and seems all of them are 
consistent with `equals` of the underlying values. 
   So it seems to be safe to just put the values into the `Set` instead of 
putting literals.
   I will make a change to avoid the unnecessary literal cast.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org