rspears74 opened a new issue, #8906: URL: https://github.com/apache/arrow-datafusion/issues/8906
### Is your feature request related to a problem or challenge?

As far as I can tell, there is no good way to load a subset of files from a partitioned table. Using `ListingTable` or another `TableProvider` like `DeltaTableProvider` from `deltalake`, I'm able to `read_table`, but this loads the entire table. I can also load a list of parquet files with `read_parquet`, but this doesn't work with partitioned tables if the partitions are not "materialized" columns in the raw parquet. The only way I've found to load partitioned files is by iterating over a list of file paths, doing the entire `TableProvider`/`read_table` process on each one individually, and `union`ing the results together.

### Describe the solution you'd like

It seems like it would be nice to be able to create a `TableProvider` with a table path, then pass some sort of file "whitelist" in. Maybe a `read_table_files(TableProvider, impl IntoIterator<Item = String>)`.

### Describe alternatives you've considered

As stated above, I've tried reading the files one-by-one and `union`ing results, but it's shockingly inefficient compared to reading all files at once.

### Additional context

_No response_
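
To make the file "whitelist" idea a bit more concrete, here is a minimal, standard-library-only sketch of the two pieces it would need. This is not DataFusion's API: `select_files` and `partition_values` are hypothetical helper names, and the sketch assumes Hive-style `key=value` directory segments under the table root, which is why the partition columns are recoverable from the path even when they are not stored in the parquet files themselves.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of DataFusion): keep only the files from the
/// table listing that also appear in the caller's allow-list.
/// The order of `listed` is preserved.
fn select_files<I>(listed: &[String], allow: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let allow: HashSet<String> = allow.into_iter().collect();
    listed
        .iter()
        .filter(|path| allow.contains(*path))
        .cloned()
        .collect()
}

/// Hypothetical helper (not part of DataFusion): recover partition values from
/// a file path, assuming Hive-style `key=value` directories below `table_root`.
/// The final path segment (the file name) is ignored.
fn partition_values(table_root: &str, file_path: &str) -> Vec<(String, String)> {
    // Work on the path relative to the table root when the prefix matches.
    let relative = file_path.strip_prefix(table_root).unwrap_or(file_path);
    // Drop the file name so that only directory segments are inspected.
    let dirs = relative
        .rsplit_once('/')
        .map(|(dirs, _file)| dirs)
        .unwrap_or("");
    dirs.split('/')
        .filter_map(|segment| segment.split_once('='))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

With something along these lines, a provider could list the table once, narrow the listing with `select_files`, and attach the values from `partition_values` to each selected file in a single scan, rather than building one provider per file and `union`ing the results.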