wgtmac commented on code in PR #2941:
URL: https://github.com/apache/parquet-java/pull/2941#discussion_r1667871679


##########
parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java:
##########
@@ -345,11 +374,14 @@ public <R> R accept(Visitor<R> visitor) {
     }
 
     /**
-     * Applies a filtering Vistitor to the Contains predicate, traversing any 
composed And or Or clauses,
-     * and finally delegating to the underlying ColumnFilterPredicate.
+     * Applies a filtering Visitor to the Contains predicate, traversing any 
composed And or Or clauses,
+     * and finally delegating to the underlying column predicate.
      */
     public abstract <R> R filter(
-        Visitor<R> visitor, BiFunction<R, R, R> andBehavior, BiFunction<R, R, 
R> orBehavior);
+        Visitor<R> visitor,
+        BiFunction<R, R, R> andBehavior,
+        BiFunction<R, R, R> orBehavior,
+        Function<R, R> notBehavior);

Review Comment:
   IIUC, all kinds of filters return all rows if `notBehavior` is found, right?



##########
parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java:
##########
@@ -371,7 +371,11 @@ public <T extends Comparable<T>> PrimitiveIterator.OfInt 
visit(NotIn<T> notIn) {
 
     @Override
     public <T extends Comparable<T>> PrimitiveIterator.OfInt visit(Contains<T> 
contains) {
-      return contains.filter(this, IndexIterator::intersection, 
IndexIterator::union);
+      return contains.filter(

Review Comment:
   Yes, we cannot make such subtraction unless it is a single-value page which 
has only `bar`.



##########
parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/IncrementallyUpdatedFilterPredicate.java:
##########
@@ -145,6 +146,83 @@ public boolean accept(Visitor visitor) {
     }
   }
 
+  /**
+   * A ValueInspector implementation that keeps state for one or more delegate 
inspectors.
+   */
+  abstract static class DelegatingValueInspector extends ValueInspector {

Review Comment:
   `IncrementallyUpdatedFilterPredicateGenerator.java` is pretty difficult to 
read and review. Since this is only applied to `ContainsPredicate` and does not 
affect others, I'm fine to make this change.



##########
parquet-hadoop/src/main/java/org/apache/parquet/filter2/statisticslevel/StatisticsFilter.java:
##########
@@ -214,7 +214,7 @@ public <T extends Comparable<T>> Boolean visit(NotIn<T> 
notIn) {
 
   @Override
   public <T extends Comparable<T>> Boolean visit(Contains<T> contains) {
-    return contains.filter(this, (l, r) -> l || r, (l, r) -> l && r);
+    return contains.filter(this, (l, r) -> l || r, (l, r) -> l && r, b -> 
BLOCK_MIGHT_MATCH);

Review Comment:
   ```suggestion
       return contains.filter(this, (l, r) -> l || r, (l, r) -> l && r, v -> 
BLOCK_MIGHT_MATCH);
   ```
   
   Just want to be consistent with `BloomFilterImpl` and `DictionaryFilter`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to