rkrishn7 commented on issue #17523:
URL: https://github.com/apache/datafusion/issues/17523#issuecomment-3282943112

   This is likely related to the discussion in
https://github.com/apache/datafusion/issues/17171#issuecomment-3282832934
   
   The discussion there is about passing the build-side hash table to the
probe side for row-level filtering. That said, I think there is also a benefit
to pushing down an expression such as an `IN` list, which can be used for
stats-based filtering (e.g. to filter out entire row groups).
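
   To illustrate the stats-based idea, here is a rough sketch using only the Rust standard library. It is not DataFusion's actual pruning code: `RowGroupStats`, `may_contain_any`, and `prune_row_groups` are hypothetical names, and the join key is assumed to be a single `i64` column with known min/max statistics per row group.

```rust
/// Hypothetical min/max statistics for the join-key column of one row group.
#[derive(Debug, Clone, Copy)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// Returns true if at least one key falls inside [min, max], meaning the
/// row group cannot be skipped. `sorted_keys` must be sorted ascending.
fn may_contain_any(stats: &RowGroupStats, sorted_keys: &[i64]) -> bool {
    // Index of the first key >= min; every key before it is below the range.
    let idx = sorted_keys.partition_point(|&k| k < stats.min);
    // That key (if there is one) must also be <= max to land in the range.
    sorted_keys.get(idx).map_or(false, |&k| k <= stats.max)
}

/// Returns the indices of the row groups that still have to be read.
fn prune_row_groups(groups: &[RowGroupStats], keys: &[i64]) -> Vec<usize> {
    let mut sorted = keys.to_vec();
    sorted.sort_unstable();
    groups
        .iter()
        .enumerate()
        .filter(|(_, g)| may_contain_any(g, &sorted))
        .map(|(i, _)| i)
        .collect()
}
```

   With row groups covering 0..=9, 10..=19, and 20..=29 and build-side keys `[25, 3]`, only the first and third row groups would need to be read.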
   
   > I understand we should put a limit on how big this List can get, maybe 
make this configurable by the users through an option and use a safe default 
limit.
   
   Yeah, I did a bit of testing around this, and things get pretty slow if the
list is large. I'm not sure what a good default would be, but I imagine
something pretty small 🤔. I think it can work well when the build side is
very small.
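
   A size cap like the one discussed could look roughly like the sketch below. Everything here is an assumption for illustration: `in_list_for_pushdown` and `DEFAULT_MAX_IN_LIST_SIZE` are hypothetical names, not existing DataFusion options, and the value 20 is only a placeholder since no default has been agreed on.

```rust
use std::collections::BTreeSet;

/// Placeholder default (assumption); a real value would need benchmarking.
const DEFAULT_MAX_IN_LIST_SIZE: usize = 20;

/// Builds the sorted, distinct key list for an `IN` filter, or returns
/// `None` once the number of distinct build-side keys exceeds `max_size`.
fn in_list_for_pushdown(build_keys: &[i64], max_size: usize) -> Option<Vec<i64>> {
    let mut distinct = BTreeSet::new();
    for &key in build_keys {
        distinct.insert(key);
        if distinct.len() > max_size {
            // Too many distinct keys: give up early and skip the pushdown.
            return None;
        }
    }
    Some(distinct.into_iter().collect())
}
```

   Bailing out as soon as the cap is exceeded keeps the cost of building the list bounded even when the build side turns out to be large.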


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

