alamb opened a new issue, #3147: URL: https://github.com/apache/arrow-datafusion/issues/3147
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

The more filtering that can be pushed down into the parquet reader, the faster a query will generally run, because less work is needed to decode and process data that would eventually be filtered out of the plan.

There are several ongoing workstreams that will eventually push substantial additional filtering into the parquet scan, which should substantially increase performance for DataFusion. I wanted to capture them here to provide more visibility.

cc @Ted-Jiang @tustvold @thinkharderdev

**Describe the solution you'd like**

Here are some of the tasks I have collected. There are likely more; please add them (either directly or via comments).

- [ ] https://github.com/apache/arrow-datafusion/pull/2677
- [ ] https://github.com/apache/arrow-rs/issues/1191
- [ ] https://github.com/apache/arrow-rs/issues/2270
- [ ] https://github.com/apache/arrow-datafusion/issues/847
- [ ] Write a blog post on the topic

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
