thinkharderdev commented on PR #3380: URL: https://github.com/apache/arrow-datafusion/pull/3380#issuecomment-1243863491
> I think this PR can be merged in as is and we can keep iterating on it subsequently.
>
> Really nice @thinkharderdev and kudos to the rest of the predicate pushdown team @Ted-Jiang and @tustvold (sorry if I have forgotten others)
>
> It would be nice to file a follow on issue (I can do so if you like) listing the steps we felt were necessary to turn on predicate pushdown by default. In my mind this is primarily about testing (functional and performance).

Thanks! I can clean up the last few things mentioned in your review this afternoon/evening. I'll also create a follow-up ticket with the remaining items.

In general I think the follow-ons are:
1. Add some proper benchmarks.
2. Once we are on arrow-rs 23 (I think) and use of the offset index is stabilized, we should allow users to pass that down in the `ParquetScanOption`.
3. Once we are all comfortable with it, turn filter pushdown on by default.
