joyhaldar opened a new pull request, #14593:
URL: https://github.com/apache/iceberg/pull/14593

   ## Summary
   This PR adds a file-pruning optimization for `NOT IN` and `!=` predicates when the referenced column holds a single distinct value in a file (i.e., when `min == max`).
   
   ## Problem
   Currently, [InclusiveMetricsEvaluator](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/expressions/InclusiveMetricsEvaluator.java) cannot prune files for `NOT IN` and `!=` predicates, even when the file provably contains no matching rows.
   
   ## Solution
   When a column's bounds are equal (`min == max`) and the column has no nulls in the file, every row holds that single value, so the file can be safely pruned if:
   - For `NOT IN`: the single value is in the exclusion list
   - For `!=`: the single value equals the literal
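
The two rules above can be sketched in a few lines of Java. This is a minimal illustration, not the code in this PR: `SingleValuePruning`, `ColumnStats`, `mightMatchNotEq`, and `mightMatchNotIn` are hypothetical names, not Iceberg classes or methods. It also assumes the stored bounds are exact (not truncated) and that the exclusion set's `equals` agrees with `compareTo` for the column type.

```java
import java.util.Set;

// Illustrative only: these names are hypothetical and are not Iceberg classes.
final class SingleValuePruning {

  /** Hypothetical per-file column metrics: lower/upper bound plus a null count. */
  static final class ColumnStats<T extends Comparable<T>> {
    final T lowerBound;
    final T upperBound;
    final long nullCount;

    ColumnStats(T lowerBound, T upperBound, long nullCount) {
      this.lowerBound = lowerBound;
      this.upperBound = upperBound;
      this.nullCount = nullCount;
    }

    /** True only when every row in the file provably holds the same non-null value. */
    boolean hasSingleNonNullValue() {
      return lowerBound != null
          && upperBound != null
          && nullCount == 0
          && lowerBound.compareTo(upperBound) == 0;
    }
  }

  /** Returns false (file can be skipped) only if every row equals the literal. */
  static <T extends Comparable<T>> boolean mightMatchNotEq(ColumnStats<T> stats, T literal) {
    boolean allRowsEqualLiteral =
        stats.hasSingleNonNullValue() && stats.lowerBound.compareTo(literal) == 0;
    return !allRowsEqualLiteral;
  }

  /** Returns false (file can be skipped) only if the single value is in the exclusion set. */
  static <T extends Comparable<T>> boolean mightMatchNotIn(ColumnStats<T> stats, Set<T> excluded) {
    boolean allRowsExcluded =
        stats.hasSingleNonNullValue() && excluded.contains(stats.lowerBound);
    return !allRowsExcluded;
  }
}
```

In this sketch, a file whose column has bounds `5..5` and no nulls is skipped for `col != 5` and for `col NOT IN (1, 5, 9)`. The same bounds with a non-zero null count, or a wider range such as `1..9`, still require a scan.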
   
   ## Testing
   - Added unit tests for both `notIn` and `notEq` optimizations
   - Verified correct behavior with nulls (must scan) and without nulls (can prune)
   
   Fixes #14592


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

