thinkharderdev commented on issue #11442: URL: https://github.com/apache/datafusion/issues/11442#issuecomment-2228197954
> ## Complete Parquet Filter Performance
>
> * [Enable parquet filter pushdown by default #3463](https://github.com/apache/datafusion/issues/3463)
>
> **What**: Enable the most advanced form of predicate pushdown / late materialization that DataFusion
>
> **Why**: Influx enables this and it helps with many of our queries. I think Coralogix uses it too (maybe @Dandandan or @thinkharderdev could correct me)
>
> **What is left**: The actual code is straightforward (change a default config value). The hard part is that last time we ran benchmarks this option actually made some queries slower. So the work is to help debug / profile / figure out why and then what changes are needed to ensure performance doesn't slow down. There are some ideas:
>
> * [Adaptive Parquet Predicate Pushdown arrow-rs#5523](https://github.com/apache/arrow-rs/issues/5523)
> * [Return TableProviderFilterPushDown::Exact when Parquet Pushdown Enabled #4028](https://github.com/apache/datafusion/issues/4028)

Yeah, we use it as well. We have some custom code to decide when to push down predicates, and in general it's a pretty tricky thing to get right.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org