alamb commented on issue #15191: URL: https://github.com/apache/datafusion/issues/15191#issuecomment-2730420780
> I don't mean to toot my own horn too much, but in fact this exact use case is what [FileScanConfig::split_groups_by_statistics](https://github.com/apache/datafusion/blob/main/datafusion/datasource/src/file_scan_config.rs#L569) was written to solve. We can solve the problem locally at the DataSourceExec, which I think is the right place to do it. It's still gated behind a feature flag; I have not been able to dedicate the time to set up benchmarks for ListingTable, which I think is required to take this feature out of experimental status and ship it.

I agree. In my mind this is all related: when trying to take maximum advantage of pre-existing orderings, I do think the optimizer should be more careful.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org