Ted-Jiang commented on issue #3360: URL: https://github.com/apache/arrow-datafusion/issues/3360#issuecomment-1236968574
@thinkharderdev Wow! Really looking forward to this! 💪

> * I think it must be possible to control what predicates get pushed down the scan, an expensive predicate may still make sense as a row group filter but not a row filter
> * We could restrict the pushed down predicates to simple binary predicates on dictionary or primitive columns by default
> * We should make visible in the explain plan what is being pushed down to what level
> * We could use the sort order if any to inform the push down order
> * We need benchmarks, lots of benchmarks 😆

Nice write-up! Thanks 👍

I think one thing we should discuss is how to define `non-selective predicates (expensive predicate)`. For now, if we want to check whether a predicate is selective on a non-sorted column, we need to know how many pages remain after filtering, so we need to read the `col-index`. If the predicate filters out zero pages, the scan will run slower than before. 🤔

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
