alamb commented on code in PR #5786:
URL: https://github.com/apache/arrow-rs/pull/5786#discussion_r1606638612

##########
parquet/src/file/metadata.rs:
##########
@@ -72,10 +69,27 @@ pub type ParquetColumnIndex = Vec<Vec<Index>>;
 /// parquet file.
 pub type ParquetOffsetIndex = Vec<Vec<Vec<PageLocation>>>;
 
-/// Global Parquet metadata.
+/// Global Parquet metadata, including [`FileMetaData`], [`RowGroupMetaData`].
+///
+/// This structure is stored in the footer of Parquet files, in the format
+/// defined by [`parquet.thrift`]. It contains:
+///
+/// * File level metadata: [`FileMetaData`]
+/// * Row Group level metadata: [`RowGroupMetaData`]
+/// * (Optional) "Page Index": [`ParquetColumnIndex`]

Review Comment:
   BTW I have filed a PR in https://github.com/apache/parquet-format/pull/245 to try and clarify this (fun fact -- the actual spec / format do not use the term "page index" anywhere except the file name of the document that describes the ColumnIndex)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
