alamb commented on code in PR #6117:
URL: https://github.com/apache/arrow-rs/pull/6117#discussion_r1692981773

##########
parquet/src/file/metadata/mod.rs:
##########
@@ -887,6 +887,7 @@ impl ColumnChunkMetaDataBuilder {
     }
     /// Sets file offset in bytes.

Review Comment:
   ```suggestion
       /// Sets file offset in bytes.
       ///
       /// This field was meant to provide an alternate to storing `ColumnMetadata` directly in
       /// the `ColumnChunkMetadata`. However, most parquet readers assume the `ColumnMetadata`
       /// is stored inline and ignore this field.
   ```

##########
parquet/src/column/writer/mod.rs:
##########
@@ -1023,8 +1017,6 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
         }
         let metadata = builder.build()?;
-        self.page_writer.write_metadata(&metadata)?;

Review Comment:
   This is the major functional change I think -- I expect it would result in slightly smaller parquet files as the per column metadata is no longer written twice

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org