tustvold commented on issue #5466: URL: https://github.com/apache/arrow-datafusion/issues/5466#issuecomment-1471818016
> get metadata_size_hint value in infer_schema and infer_stats functions of ParquetFormat.

I think it is worth drawing a distinction between the logic for catalog inference, i.e. `ListingTable`, and that of query processing, i.e. `FileScanConfig`.

Most practical applications will need a `TableProvider` backed by some sort of catalog for reasonable performance. That catalog would be an ideal place to store information such as the footer size, schema, and statistics, which can then be used to populate `FileScanConfig` accurately.

For `TableProvider` implementations that don't have access to this information, such as `ListingTable`, I think it is perfectly acceptable to use a single config value for the metadata size hint.
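To make the split concrete, here is a rough sketch of how a catalog-backed provider might choose a per-file hint and fall back to a single config value when it knows nothing about a file. `Catalog`, `CatalogEntry`, and `metadata_size_hint` are hypothetical names invented for illustration; none of them are DataFusion types or APIs.

```rust
use std::collections::HashMap;

/// Hypothetical per-file facts a catalog might record (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
pub struct CatalogEntry {
    /// Exact Parquet footer size in bytes, if the catalog recorded it.
    pub footer_size: Option<usize>,
}

/// Hypothetical catalog keyed by object path.
#[derive(Debug, Default)]
pub struct Catalog {
    entries: HashMap<String, CatalogEntry>,
}

impl Catalog {
    /// Record what is known about one file.
    pub fn insert(&mut self, path: &str, entry: CatalogEntry) {
        self.entries.insert(path.to_string(), entry);
    }

    /// Per-file hint: the recorded footer size when the catalog has one,
    /// otherwise the single config-wide default (which may itself be unset).
    pub fn metadata_size_hint(
        &self,
        path: &str,
        default_hint: Option<usize>,
    ) -> Option<usize> {
        self.entries
            .get(path)
            .and_then(|entry| entry.footer_size)
            .or(default_hint)
    }
}
```

A provider with a real catalog would pass the resulting value along when building its scan, while something like `ListingTable` would effectively always take the `default_hint` branch.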
