alamb commented on issue #8609: URL: https://github.com/apache/arrow-datafusion/issues/8609#issuecomment-1868055470
I think a more elegant solution would be to implement direct support in pruning for large `IN` lists. The parameter you refer to effectively rewrites such predicates into `OR` chains so that the existing min/max based evaluation can work on them.

A config parameter is probably fine for the near term.

We have recently been improving the code in this area; see https://github.com/apache/arrow-datafusion/pull/8440 for example.

Maybe we can update the `PruningPredicate` logic to make more use of the `contained` API to rule out containers based on their min/max values. Specifically, we could figure out the min and max values in the `IN` list and then compare them against the actual min/max values of the columns 🤔
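To make the min/max idea above concrete, here is a rough sketch of what I have in mind. It does not use any DataFusion types: `ContainerStats`, `in_list_bounds`, and `prune_in_list` are hypothetical names invented for illustration, the column is assumed to be a non-null `i64`, and the real `PruningPredicate` code would need to handle nulls and other data types.

```rust
/// Min/max statistics for one container (for example a row group or a file).
/// Hypothetical type for illustration only; not a DataFusion API.
#[derive(Debug, Clone, Copy)]
struct ContainerStats {
    min: i64,
    max: i64,
}

/// Returns the (min, max) of the literals in an `IN` list,
/// or `None` if the list is empty.
fn in_list_bounds(values: &[i64]) -> Option<(i64, i64)> {
    let min = *values.iter().min()?;
    let max = *values.iter().max()?;
    Some((min, max))
}

/// For each container, returns `true` if it might hold a row matching
/// `col IN (values)` and `false` if it definitely cannot (safe to prune).
///
/// The list bounds are computed once, so each container costs two
/// comparisons regardless of how long the `IN` list is.
fn prune_in_list(containers: &[ContainerStats], values: &[i64]) -> Vec<bool> {
    let bounds = in_list_bounds(values);
    containers
        .iter()
        .map(|stats| match bounds {
            // An empty IN list matches nothing, so every container can go.
            None => false,
            // Keep the container only if [list_min, list_max] overlaps
            // the container's [min, max] range.
            Some((list_min, list_max)) => list_min <= stats.max && list_max >= stats.min,
        })
        .collect()
}
```

This check is conservative: a container whose range falls in a gap between list values (say min/max of 40/50 with `IN (25, 150)`) is still kept, so it is coarser than testing each value via the `OR` rewrite, but it never prunes a container that could match.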