rkrishn7 commented on issue #17523: URL: https://github.com/apache/datafusion/issues/17523#issuecomment-3282943112
This is likely related to https://github.com/apache/datafusion/issues/17171#issuecomment-3282832934! The discussion there is about passing the build-side hash table to the probe side for row filtering. That said, I do think there is also a benefit to pushing down an expression like an `IN LIST`, since it can be used for stats-based filtering (e.g. to filter out entire row groups).

> I understand we should put a limit on how big this List can get, maybe make this configurable by the users through an option and use a safe default limit.

Yeah, I did a bit of testing around this and things get pretty slow if the list is large. Not sure what a good default would be, but I imagine something pretty small 🤔. I think it can work well when we have a very small build side.
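To make the stats-based idea concrete, here is a minimal sketch of pruning row groups with a size-capped `IN LIST`. It is not DataFusion's actual implementation: `RowGroupStats`, `MAX_IN_LIST_LEN`, `build_in_list`, and `prune_row_groups` are hypothetical names, the cap of 4 is an arbitrary placeholder rather than a measured default, and the join key is assumed to be a single `i64` column with min/max statistics available.

```rust
/// Hypothetical min/max statistics for the join-key column of one row group.
#[derive(Debug, Clone, Copy)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// Hypothetical cap on the pushed-down list size (not a real DataFusion option).
const MAX_IN_LIST_LEN: usize = 4;

/// Build a sorted, deduplicated IN list from the build-side keys.
/// Returns `None` when there are more distinct keys than `max_len`,
/// in which case nothing is pushed down.
fn build_in_list(build_keys: &[i64], max_len: usize) -> Option<Vec<i64>> {
    let mut keys = build_keys.to_vec();
    keys.sort_unstable();
    keys.dedup();
    if keys.len() > max_len {
        None
    } else {
        Some(keys)
    }
}

/// Return the indices of row groups whose [min, max] range could contain
/// at least one value from `in_list`. `in_list` must be sorted ascending.
fn prune_row_groups(stats: &[RowGroupStats], in_list: &[i64]) -> Vec<usize> {
    stats
        .iter()
        .enumerate()
        .filter(|(_, s)| {
            // Position of the first list value that is >= the row group's min.
            let idx = in_list.partition_point(|&v| v < s.min);
            // Keep the row group only if that value is also <= its max.
            idx < in_list.len() && in_list[idx] <= s.max
        })
        .map(|(i, _)| i)
        .collect()
}
```

With build-side keys `[25, 3, 3]` and row groups covering `0..=9`, `10..=19`, and `20..=29`, the list becomes `[3, 25]` and only the first and third row groups survive. Each row group costs one binary search over the list, so this check alone stays cheap; the sketch only illustrates the shape of the cap and does not reproduce the slowdown seen with large lists.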