isidentical opened a new issue, #3774:
URL: https://github.com/apache/arrow-datafusion/issues/3774

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   #1347 enabled collection of statistics by default on the `ListingOptions` 
constructor, but tables created with `CREATE EXTERNAL TABLE` still can't 
use this feature, since their `ListingOptions` are built manually rather than 
through the constructor.
   
https://github.com/apache/arrow-datafusion/blob/e54110fb592e03704da5f6ebd832b8fe1c51123b/datafusion/core/src/execution/context.rs#L486-L488
   
   **Describe the solution you'd like**
   We already have per-file-extension listing option implementations for the 
`read_` dataframe APIs (e.g. `CsvReadOptions`, `ParquetReadOptions`), and they 
have sane defaults (for example, `collect_stats` is `false` for CSV and `true` 
for Parquet). I wonder whether we can just use them here and obtain the 
`ListingOptions` directly from them.
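
   To make the idea concrete, here is a minimal, self-contained sketch. It 
does not use the real DataFusion types: `ListingOptions`, `CsvReadOptions`, 
`ParquetReadOptions`, the `ToListingOptions` trait, and `listing_options_for` 
are all simplified, hypothetical stand-ins defined in the snippet itself. It 
only illustrates how the `CREATE EXTERNAL TABLE` path could pick up the 
per-format defaults instead of filling in the fields by hand.

```rust
// Hypothetical stand-in for DataFusion's `ListingOptions`; only the
// fields needed for this illustration are modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct ListingOptions {
    pub file_extension: String,
    pub collect_stat: bool,
}

// Hypothetical trait: every per-format read-options type can produce
// listing options that carry that format's defaults.
pub trait ToListingOptions {
    fn to_listing_options(&self) -> ListingOptions;
}

// Simplified stand-ins for the per-format read options.
#[derive(Default)]
pub struct CsvReadOptions;

#[derive(Default)]
pub struct ParquetReadOptions;

impl ToListingOptions for CsvReadOptions {
    fn to_listing_options(&self) -> ListingOptions {
        ListingOptions {
            file_extension: ".csv".to_string(),
            // Defaults described in the issue: no statistics for CSV.
            collect_stat: false,
        }
    }
}

impl ToListingOptions for ParquetReadOptions {
    fn to_listing_options(&self) -> ListingOptions {
        ListingOptions {
            file_extension: ".parquet".to_string(),
            // Defaults described in the issue: statistics on for Parquet.
            collect_stat: true,
        }
    }
}

// Hypothetical helper for the `CREATE EXTERNAL TABLE` path: derive the
// listing options from the read options instead of building them manually.
pub fn listing_options_for(file_type: &str) -> Option<ListingOptions> {
    match file_type.to_ascii_lowercase().as_str() {
        "csv" => Some(CsvReadOptions::default().to_listing_options()),
        "parquet" => Some(ParquetReadOptions::default().to_listing_options()),
        _ => None, // other formats are out of scope for this sketch
    }
}
```

   With something along these lines, the external table code would call the 
helper with the declared file type and would inherit `collect_stat` from the 
format's defaults rather than hard-coding it.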
   
   **Describe alternatives you've considered**
   Leaving it as is, or enabling statistics collection globally (instead of 
refactoring that part to use `ReadOptions`) by just setting the flag to true.
   


