alamb opened a new pull request, #3828: URL: https://github.com/apache/arrow-datafusion/pull/3828
Draft until:
- [ ] https://github.com/apache/arrow-datafusion/pull/3822 is merged
- [ ] We have completed testing / validation

# Which issue does this PR close?

Closes https://github.com/apache/arrow-datafusion/issues/3463

re https://github.com/apache/arrow-datafusion/issues/3462

# Rationale for this change

This PR turns on parquet scan predicate pushdown (see https://github.com/apache/arrow-datafusion/issues/3462) by default. I am putting it up early as part of the testing process, so we can work through any issues it may uncover.

This feature promises to be one of the most significant performance improvements for DataFusion reading from parquet in a while. All the hard work was done by @Ted-Jiang, @thinkharderdev, and @tustvold.

# What changes are included in this PR?

Enable pushing filters into the scan directly.

Note this feature can be disabled by setting the `datafusion.execution.parquet.pushdown_filters` configuration setting to false.

# Are there any user-facing changes?

Hopefully faster performance.
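
For readers unfamiliar with the idea, here is a rough, self-contained sketch of what "pushing filters into the scan" means. It is a toy model, not the DataFusion or parquet crate implementation: `RowGroup`, `ScanResult`, `scan_without_pushdown`, and `scan_with_pushdown` are hypothetical names invented for illustration, and "decoding" is simulated by cloning values. It assumes both columns have the same length.

```rust
// Toy illustration only: this is NOT DataFusion's or the parquet crate's API.
// Every name below is hypothetical and exists only for this sketch.

/// A tiny in-memory stand-in for one parquet row group with two columns.
/// Assumption: `id` and `payload` have the same number of values.
pub struct RowGroup {
    /// Cheap column that the predicate is evaluated on.
    pub id: Vec<i64>,
    /// Wide column we would prefer not to decode for filtered-out rows.
    pub payload: Vec<String>,
}

/// Matching rows plus a count of how many payload values were "decoded".
pub struct ScanResult {
    pub rows: Vec<(i64, String)>,
    pub payload_values_decoded: usize,
}

/// Without pushdown: decode every value of every column, then filter.
pub fn scan_without_pushdown(rg: &RowGroup, pred: impl Fn(i64) -> bool) -> ScanResult {
    // Materialize all rows first (simulated decode of both columns).
    let decoded: Vec<(i64, String)> = rg
        .id
        .iter()
        .copied()
        .zip(rg.payload.iter().cloned())
        .collect();
    let payload_values_decoded = decoded.len();
    // The filter only runs after everything has been decoded.
    let rows = decoded.into_iter().filter(|(id, _)| pred(*id)).collect();
    ScanResult { rows, payload_values_decoded }
}

/// With pushdown: evaluate the predicate on the filter column first, then
/// decode the remaining column only for the rows that were selected.
pub fn scan_with_pushdown(rg: &RowGroup, pred: impl Fn(i64) -> bool) -> ScanResult {
    // Step 1: build a row selection from the filter column alone.
    let selected: Vec<usize> = rg
        .id
        .iter()
        .enumerate()
        .filter(|(_, id)| pred(**id))
        .map(|(i, _)| i)
        .collect();
    // Step 2: decode payload values only at the selected positions.
    let rows: Vec<(i64, String)> = selected
        .iter()
        .map(|&i| (rg.id[i], rg.payload[i].clone()))
        .collect();
    let payload_values_decoded = rows.len();
    ScanResult { rows, payload_values_decoded }
}
```

Both functions return the same rows; the difference is how much of the wide column is touched. In this toy model the saving grows as the predicate becomes more selective, which is the intuition behind the hoped-for speedup.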
