guiyanakuang commented on code in PR #1692:
URL: https://github.com/apache/orc/pull/1692#discussion_r1434659765


##########
java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java:
##########
@@ -747,8 +752,10 @@ static TruthValue evaluatePredicateRange(PredicateLeaf 
predicate,
                                            ValueRange range,
                                            BloomFilter bloomFilter,
                                            boolean useUTCTimestamp) {
+    // If range is invalid, that means that no value (including null) is 
written to this column
+    // we should return TruthValue.NO for any predicate.
     if (!range.isValid()) {
-      return TruthValue.YES_NO_NULL;
+      return TruthValue.NO;

Review Comment:
   @wgtmac , Thanks for the review, but I think the change here is the key to 
fixing the bug.
   The schema in the unit test is `struct<x:struct<z:double>,y:int>`
   I've constructed 1024 rows of data, and `x` is null. This means that the 
statistics for `x.z` are kept in the initial state `numberOfValues = 0 hasNull 
= false`. 
   
   Any filtering of `x.z` at this point should return `TruthValue.NO`
   
   Here's the enumeration operator: for data that doesn't exist (which 
distinguishes it from having a value and being Null) I think it's valid to 
return `TruthValue.NO`. If there is a counterexample, I'd be happy to modify it.
   
   ```java
   public interface PredicateLeaf {
   
     /**
      * The possible operators for predicates. To get the opposites, construct
      * an expression with a not operator.
      */
     public static enum Operator {
       EQUALS,
       NULL_SAFE_EQUALS,
       LESS_THAN,
       LESS_THAN_EQUALS,
       IN,
       BETWEEN,
       IS_NULL
     }
     ....
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to