alamb commented on PR #4908: URL: https://github.com/apache/arrow-datafusion/pull/4908#issuecomment-1397083616
> What do you think about having a single method which only takes a list of paths? For a single path, the callee can create a slice/Vec. This would be a lot simpler to do.

I was thinking about this PR and I have an alternate suggestion.

It seems to me that `read_parquet`, `read_avro`, etc. are wrappers to simplify the process of creating a `ListingTable`. Support for multiple paths starts complicating the API more. What do you think about, instead of adding `read_parquet_from_paths`, making it easier to see how to read multiple files using the `ListingTable` API directly?

For example, we could add a doc example like the following:

```rust
/// Creates a [`DataFrame`] for reading a Parquet data source from a single file or directory.
///
/// Note: if you want to read from multiple files, or control other behaviors
/// you can use the [`ListingTable`] API directly. For example to read multiple files
///
/// ```
/// Example here (basically copy/paste the implementation of read_parquet and support multiple files)
/// ```
pub async fn read_parquet(
    &self,
    table_path: impl AsRef<str>,
    options: ParquetReadOptions<'_>,
) -> Result<DataFrame> {
...
```

We could give similar treatment to the docstrings for `read_avro` and `read_csv` (perhaps by pointing to the docs for `read_parquet` for an example of creating `ListingTable`s).
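For comparison, the list-only method from the quoted suggestion can be sketched with the standard library alone. This is a minimal illustration, not DataFusion code: `ScanRequest` is a hypothetical stand-in type, the signatures are assumed, and the real methods would be `async` and return a `DataFrame`.

```rust
/// Hypothetical stand-in for a scan description; it only records which
/// paths would be read and is not a DataFusion type.
#[derive(Debug, PartialEq)]
pub struct ScanRequest {
    pub paths: Vec<String>,
}

/// The single list-taking entry point from the quoted suggestion.
/// Assumed signature for illustration only.
pub fn read_parquet_from_paths(paths: &[&str]) -> Result<ScanRequest, String> {
    // Reject an empty list so that every request names at least one source.
    if paths.is_empty() {
        return Err("at least one path is required".to_string());
    }
    Ok(ScanRequest {
        paths: paths.iter().map(|p| p.to_string()).collect(),
    })
}

/// Single-path convenience: wrap the path in a one-element slice and delegate.
pub fn read_parquet(path: &str) -> Result<ScanRequest, String> {
    read_parquet_from_paths(&[path])
}
```

In this sketch, supporting multiple paths means either a second public method or an extra slice at every single-path call site, which is the API growth the comment suggests avoiding by documenting the `ListingTable` route.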