alamb commented on issue #15177: URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2717635254
Note that late materialization (the join / semi join rewrite) needs join operator support that DataFusion doesn't yet have (we could add it, but it will take nontrivial effort).

My suggested order of implementation is:
1. https://github.com/apache/datafusion/issues/3463 with @XiangpengHao (so that we can actually evaluate the topk filter during scan)
2. Then implement topk filtering: https://github.com/apache/datafusion/issues/15037

I actually think that will likely get us quite fast. I am not sure how much more improvement late materialized joins will get without a specialized file format.

I don't have time to help plan out late materializing joins at the moment, but I am quite interested in pushing along the predicate pushdown.
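
To make the "evaluate the topk filter during scan" idea concrete, here is a rough sketch in plain Python. It is not DataFusion code and does not reflect how either linked issue is or will be implemented. The function name `topk_with_dynamic_filter` and the list-of-lists `batches` input are hypothetical stand-ins for a scan that yields batches of sort keys. The point is only that once the TopK heap is full, its current worst value can be used as a filter on later batches, so most rows never reach the operator.

```python
import heapq


def topk_with_dynamic_filter(batches, k):
    """Return (k smallest keys in ascending order, rows skipped by the filter).

    `batches` is an iterable of lists of sort keys; each list stands in for
    one batch produced by a scan. This is an illustrative sketch only.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    heap = []    # max-heap via negation: holds the k smallest keys seen so far
    skipped = 0  # rows rejected at "scan time", before reaching the heap

    for batch in batches:
        # The filter threshold is the current k-th smallest key.
        # It only exists once the heap already holds k entries.
        threshold = -heap[0] if len(heap) == k else None

        # "Scan-side" filter: drop rows that cannot enter the top k.
        if threshold is None:
            survivors = batch
        else:
            survivors = [key for key in batch if key < threshold]
        skipped += len(batch) - len(survivors)

        # "Operator-side" TopK: update the heap with the surviving rows.
        for key in survivors:
            if len(heap) < k:
                heapq.heappush(heap, -key)
            elif key < -heap[0]:
                heapq.heapreplace(heap, -key)

    return sorted(-x for x in heap), skipped


result, skipped = topk_with_dynamic_filter([[5, 1, 9], [7, 3, 8], [2, 10, 6]], 2)
print(result, skipped)
```

In this toy version the filter is refreshed once per batch. The tighter the threshold gets, the more rows are dropped before any other columns would need to be decoded, which is where the benefit of pushing the filter into the scan would come from.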