aokolnychyi opened a new pull request #1664:
URL: https://github.com/apache/iceberg/pull/1664


   This PR optimizes the evaluation of IN predicates on dictionary-encoded 
columns in Parquet.
   
   The previous solution called `isEmpty` on the result of `Sets$intersection`, 
which in turn calls `Collections$disjoint(set2, set1)`. When its first argument 
is a set, `disjoint` always iterates over the second argument and probes the 
first, even if the second argument is also a set and is the larger of the two. 
As a result, we iterated through all dictionary values to evaluate IN 
predicates.
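
   To illustrate the idea, here is a minimal sketch of an intersection check 
that always iterates over the smaller set and probes the larger one. The class 
and method names (`DictionaryInCheck`, `intersects`) are hypothetical and used 
only for illustration. This is not necessarily the code in this PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DictionaryInCheck {

  // Hypothetical helper: returns true if the two sets share at least one element.
  // It iterates over whichever set is smaller and calls contains() on the other,
  // so the cost is proportional to the smaller set's size.
  static <T> boolean intersects(Set<T> dictionary, Set<T> literals) {
    Set<T> smaller = dictionary.size() <= literals.size() ? dictionary : literals;
    Set<T> larger = smaller == dictionary ? literals : dictionary;
    for (T value : smaller) {
      if (larger.contains(value)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // A large dictionary and a short IN list, the case described above.
    Set<Integer> dictionary = new HashSet<>();
    for (int i = 0; i < 100_000; i++) {
      dictionary.add(i);
    }
    Set<Integer> literals = new HashSet<>(Arrays.asList(5, 200_000));

    // Only the two literals are visited, not the 100,000 dictionary values.
    System.out.println(intersects(dictionary, literals));
  }
}
```

   With a short IN list and a large dictionary, the loop visits at most as 
many elements as the IN list contains, assuming both sets offer constant-time 
`contains`.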

