Matthieusalor commented on issue #6564: URL: https://github.com/apache/iceberg/issues/6564#issuecomment-1491632140
Regarding statistics for written files: pyarrow's write_to_dataset and write_dataset functions accept a file_visitor argument that can be used to retrieve the path and metadata of each written file. The metadata object is a FileMetaData: https://arrow.apache.org/docs/python/generated/pyarrow.parquet.FileMetaData.html It gives access to the statistics of each row group through RowGroupMetaData: https://arrow.apache.org/docs/python/generated/pyarrow.parquet.RowGroupMetaData.html#pyarrow.parquet.RowGroupMetaData
