etseidl opened a new issue, #6115: URL: https://github.com/apache/arrow-rs/issues/6115
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

Writing the thrift `ColumnMetaData` outside of the Parquet file footer was recently deprecated (https://github.com/apache/parquet-format/pull/440), as was setting the `ColumnChunk::file_offset` field. In addition, the `ColumnMetaData` currently written has incorrect values for `dictionary_page_offset` and `data_page_offset`: they are relative to the start of the chunk rather than being offsets to the pages' locations in the file.

**Describe the solution you'd like**

The current Parquet [spec](https://github.com/apache/parquet-format/blob/5a5c8948e60770f8a8356a8f5e616d5ae1079d4b/src/main/thrift/parquet.thrift#L870-L878) indicates that the `file_offset` field should be set to 0 and that `ColumnMetaData` should no longer be written inline with the data.

**Describe alternatives you've considered**

If the inline `ColumnMetaData` is not removed, the offsets mentioned above should be set to correct values.
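
To make the alternative concrete, here is a minimal sketch of the offset correction it would involve. This is not the arrow-rs writer code: `PageOffsets`, `to_file_offsets`, and `chunk_start` are hypothetical names used only for illustration, and the sketch assumes the writer knows the file position of the chunk's first byte.

```rust
/// Hypothetical stand-in for the page offsets of one column chunk.
/// This is not a type from the parquet crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageOffsets {
    /// Offset of the dictionary page, if the chunk has one.
    pub dictionary_page_offset: Option<i64>,
    /// Offset of the first data page.
    pub data_page_offset: i64,
}

/// Convert offsets measured from the start of the chunk into offsets
/// measured from the start of the file.
///
/// `chunk_start` is the assumed file position of the chunk's first byte.
/// Returns `None` if any input is negative or the addition overflows.
pub fn to_file_offsets(relative: PageOffsets, chunk_start: i64) -> Option<PageOffsets> {
    if chunk_start < 0 || relative.data_page_offset < 0 {
        return None;
    }
    let dictionary_page_offset = match relative.dictionary_page_offset {
        Some(off) if off < 0 => return None,
        Some(off) => Some(chunk_start.checked_add(off)?),
        None => None,
    };
    Some(PageOffsets {
        dictionary_page_offset,
        data_page_offset: chunk_start.checked_add(relative.data_page_offset)?,
    })
}
```

For example, with a chunk assumed to start at file position 4096, a dictionary page at relative offset 0, and a first data page at relative offset 120, the corrected values would be 4096 and 4216.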
