alamb commented on issue #8609:
URL: 
https://github.com/apache/arrow-datafusion/issues/8609#issuecomment-1868055470

   I think a more elegant solution would be to implement direct support in 
pruning for large `IN` lists -- the parameter you refer to effectively 
rewrites such predicates into `OR` chains so that the existing min/max-based 
evaluation can work on them.
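
   To illustrate what that rewrite buys, here is a minimal sketch of 
evaluating `col IN (...)` as an `OR` chain of equality checks against 
min/max statistics. It is not DataFusion's actual code: `MinMax`, 
`eq_may_match` and `in_list_may_match` are hypothetical names, and the 
values are assumed to be `i64` for simplicity.

```rust
/// Min/max statistics for one container (e.g. a row group).
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy)]
struct MinMax {
    min: i64,
    max: i64,
}

/// `col = v` can only match rows in a container if `min <= v <= max`.
fn eq_may_match(stats: MinMax, v: i64) -> bool {
    stats.min <= v && v <= stats.max
}

/// `col IN (v1, v2, ...)` treated as `col = v1 OR col = v2 OR ...`:
/// the container must be kept if any single equality may match.
/// An empty list matches nothing, so the container can be pruned.
fn in_list_may_match(stats: MinMax, list: &[i64]) -> bool {
    list.iter().any(|&v| eq_may_match(stats, v))
}
```

The cost is one comparison pair per list element per container, which is 
presumably why the rewrite is only applied to lists below some size limit.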
   
   A config parameter is probably fine for the near term. 
   
   We have recently been improving the code in this area -- see 
https://github.com/apache/arrow-datafusion/pull/8440 for example. Maybe we can 
update the PruningPredicate logic to make more use of the `contained` API to 
rule out containers based on their min/max values.
   
   Specifically, we could compute the min and max of the values in the `IN` 
list and then compare them against the actual min/max values of the column in 
each container 🤔 
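
   A rough sketch of that idea, under the assumption of `i64` values 
(`list_bounds` and `range_may_overlap` are hypothetical helpers, not 
existing DataFusion functions): compute the list's bounds once, then keep a 
container only if its `[min, max]` range overlaps them.

```rust
/// Returns the (min, max) of the `IN` list values, or `None` if the list is empty.
fn list_bounds(list: &[i64]) -> Option<(i64, i64)> {
    let min = *list.iter().min()?;
    let max = *list.iter().max()?;
    Some((min, max))
}

/// Returns true if a container whose column values lie in `[col_min, col_max]`
/// might contain a value from the list, judged only by the list's bounds.
fn range_may_overlap(col_min: i64, col_max: i64, bounds: Option<(i64, i64)>) -> bool {
    match bounds {
        // An empty list can never match, so the container can be pruned.
        None => false,
        // Two closed ranges overlap unless one ends before the other starts.
        Some((lo, hi)) => col_max >= lo && col_min <= hi,
    }
}
```

This check costs the same per container no matter how long the list is, but 
it is more conservative than the `OR` chain: for the list `(1, 100)` it keeps 
a container with min 40 and max 50, even though neither value can be in it.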
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
