alamb opened a new issue, #4177: URL: https://github.com/apache/arrow-datafusion/issues/4177
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

Suggested by @crepererum in https://github.com/apache/arrow-datafusion/issues/4169#issuecomment-1311347788

Some systems, such as IOx, store parquet files in a particular sorted order and then use the fact that the data is sorted for a variety of sort-related optimizations.

The `BasicEnforcement` rule added in https://github.com/apache/arrow-datafusion/pull/4122 by @mingmwang allows DataFusion to take advantage of known information about the sort order.

One contrived example: if your parquet file is sorted by `price` and your query is `select * from data order by price limit 10`, DataFusion can avoid scanning the entire file.

Another, more interesting example could be using the sorted order to reorder pushdown filters, or using a sort-merge join without actually sorting.

**Describe the solution you'd like**

- [ ] https://github.com/apache/arrow-rs/issues/3090
- [ ] Detect and use this sorted information when creating a ListingTable that reads from parquet files

**Describe alternatives you've considered**

Don't do it

**Additional context**

Here is a ticket that tracks allowing users of DataFusion to manually specify the sort order: https://github.com/apache/arrow-datafusion/issues/4169
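
To make the `order by price limit 10` example concrete, here is a minimal sketch in plain Python of why a known sort order lets the scan stop early. It is not DataFusion code and does not use any DataFusion or parquet API; the row shape and the helper names `scan`, `top_n_unsorted`, and `top_n_presorted` are hypothetical and exist only for illustration.

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row shape for illustration: (price, item name).
Row = Tuple[float, str]


def scan(rows: Iterable[Row], counter: List[int]) -> Iterator[Row]:
    """Yield rows one at a time, counting how many were actually read."""
    for row in rows:
        counter[0] += 1
        yield row


def top_n_unsorted(rows: Iterable[Row], limit: int, counter: List[int]) -> List[Row]:
    """Sort order unknown: every row must be read to find the smallest prices."""
    return heapq.nsmallest(limit, scan(rows, counter), key=lambda r: r[0])


def top_n_presorted(rows: Iterable[Row], limit: int, counter: List[int]) -> List[Row]:
    """Input known to be sorted by price: stop reading after `limit` rows."""
    return list(islice(scan(rows, counter), limit))


if __name__ == "__main__":
    # Stand-in for a file whose rows are already sorted by price.
    data = [(float(i), f"item-{i}") for i in range(1000)]

    read_full, read_early = [0], [0]
    full = top_n_unsorted(data, 10, read_full)
    early = top_n_presorted(data, 10, read_early)

    print(full == early)   # same answer either way
    print(read_full[0])    # rows read without sort information
    print(read_early[0])   # rows read when the sort order is known
```

On this toy input both functions return the same ten rows, but the first reads all 1000 rows while the second reads only 10. In a real engine the saving would presumably come from skipping row groups or whole files rather than individual rows.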
