tustvold commented on issue #5466:
URL: 
https://github.com/apache/arrow-datafusion/issues/5466#issuecomment-1471818016

   > get metadata_size_hint value in infer_schema and infer_stats functions of 
ParquetFormat.
   
   I think it is worth distinguishing the logic for catalog inference, i.e. 
`ListingTable`, from that of query processing, i.e. `FileScanConfig`. Most 
practical applications will need a `TableProvider` backed by some sort of 
catalog for reasonable performance. That catalog would be an ideal place to 
store information such as the footer size, schema, and statistics, which can 
then be used to populate `FileScanConfig` accurately. 
   
   For `TableProvider` implementations that don't have access to this 
information, such as `ListingTable`, I think it is perfectly acceptable to use 
a single config value for the metadata size hint.
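
   To illustrate the split described above, here is a minimal sketch rather 
than DataFusion's actual API. `Catalog`, `CatalogEntry`, `record`, and 
`metadata_size_hint` are hypothetical names invented for this example. It 
assumes a catalog that may know a file's footer size and falls back to a 
single configured value when it does not.

```rust
use std::collections::HashMap;

/// Hypothetical per-file facts a catalog might record (not a DataFusion type).
#[derive(Debug, Clone, Default)]
pub struct CatalogEntry {
    /// Size in bytes of the Parquet footer, if the catalog has recorded it.
    pub footer_size: Option<usize>,
}

/// Hypothetical catalog keyed by object path.
#[derive(Debug, Default)]
pub struct Catalog {
    entries: HashMap<String, CatalogEntry>,
}

impl Catalog {
    /// Record the footer size observed for `path`.
    pub fn record(&mut self, path: &str, footer_size: usize) {
        self.entries.insert(
            path.to_string(),
            CatalogEntry { footer_size: Some(footer_size) },
        );
    }

    /// Return the per-file hint when the catalog knows it; otherwise fall
    /// back to the single configured default (the `ListingTable`-style case).
    pub fn metadata_size_hint(
        &self,
        path: &str,
        default_hint: Option<usize>,
    ) -> Option<usize> {
        self.entries
            .get(path)
            .and_then(|entry| entry.footer_size)
            .or(default_hint)
    }
}
```

   A catalog-backed provider would pass the per-file value into its scan 
configuration, while a provider without a catalog would always end up with the 
default.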


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

