tustvold commented on issue #3463: URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708264117
> Has there been any attempts to keep track of filter selectivity and use that to our advantage? For example we could track filter selectivity for each filter and use that to:

The new parquet pushdown sort of does this IIUC, but at the physical execution level, i.e. after the IO strategy is somewhat baked in. I definitely think DataFusion should be using selectivity estimates at planning time to determine which predicates make sense to push down. The parquet crate can't do this because it has no notion of what the predicates actually are (it intentionally does not bundle its own expression framework), but I believe DF already has the machinery to handle this.
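
A minimal sketch of what a planning-time decision could look like, purely for illustration. `PredicateEstimate`, `choose_pushdown`, and the `max_selectivity` threshold are hypothetical names and are not part of DataFusion or the parquet crate. The sketch assumes the planner already has a selectivity estimate per predicate, and that only predicates expected to discard most rows are worth pushing into the scan.

```rust
/// Hypothetical planning-time estimate for a single filter predicate.
/// Not a DataFusion or parquet type; illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct PredicateEstimate {
    /// Display form of the predicate, e.g. "a = 5".
    pub name: String,
    /// Estimated fraction of rows that pass the filter, in [0.0, 1.0].
    pub selectivity: f64,
}

/// Split predicates into (pushed down into the scan, evaluated after the scan).
///
/// Assumption: a predicate is pushed down only when its estimated selectivity
/// is at or below `max_selectivity`, i.e. it is expected to discard most rows.
pub fn choose_pushdown(
    predicates: &[PredicateEstimate],
    max_selectivity: f64,
) -> (Vec<PredicateEstimate>, Vec<PredicateEstimate>) {
    let (mut pushed, residual): (Vec<_>, Vec<_>) = predicates
        .iter()
        .cloned()
        .partition(|p| p.selectivity <= max_selectivity);

    // Order pushed-down predicates so the one that keeps the fewest rows runs first.
    pushed.sort_by(|a, b| a.selectivity.total_cmp(&b.selectivity));

    (pushed, residual)
}
```

With estimates of 0.9, 0.1, and 0.01 and a threshold of 0.2, the last two predicates would be pushed down (0.01 first) and the first would stay as a residual filter above the scan. A real planner would presumably also weigh evaluation cost and which columns each predicate needs to decode.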
