alamb commented on code in PR #8098:
URL: https://github.com/apache/arrow-rs/pull/8098#discussion_r2274224852
##########
parquet/src/file/properties.rs:
##########
@@ -311,15 +310,19 @@ impl WriterProperties {
         self.column_index_truncate_length
     }
 
-    /// Returns the maximum length of truncated min/max values in [`Statistics`].
+    /// Returns the maximum length of truncated min/max values in [`Statistics`] for a specific column.
     ///
     /// `None` if truncation is disabled, must be greater than 0 otherwise.
     ///
-    /// For more details see [`WriterPropertiesBuilder::set_statistics_truncate_length`]
+    /// For more details see [`WriterPropertiesBuilder::set_column_statistics_truncate_length`]
     ///
     /// [`Statistics`]: crate::file::statistics::Statistics
-    pub fn statistics_truncate_length(&self) -> Option<usize> {
-        self.statistics_truncate_length
+    pub fn statistics_truncate_length(&self, col: &ColumnPath) -> Option<usize> {

Review Comment:
This is a [public API](https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html#method.statistics_truncate_length) change, so we wouldn't be able to release it until the next major release in a few months (Oct 2025):
- https://github.com/apache/arrow-rs/issues/7835

To avoid the public API change, we could add a new function (for example, `column_chunk_statistics_truncate_length`) and deprecate the old one as described [here](https://github.com/apache/arrow-rs?tab=readme-ov-file#deprecation-guidelines).

##########
parquet/src/file/properties.rs:
##########
@@ -1155,6 +1177,11 @@ impl ColumnProperties {
     fn bloom_filter_properties(&self) -> Option<&BloomFilterProperties> {
         self.bloom_filter_properties.as_ref()
     }
+
+    /// Returns the statistics truncate length for this column.
+    fn statistics_truncate_length(&self) -> Option<Option<usize>> {

Review Comment:
Why is this an `Option<Option<usize>>`? What does the second level of option represent? If it is deliberate, can we please document what that second level means?
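
To make the two comments above concrete, here is a rough sketch of what I have in mind. It is not the PR's code: `ColumnPath`, `ColumnProperties`, and `WriterProperties` below are simplified stand-ins, the name `column_chunk_statistics_truncate_length` is only the example name suggested above, and the three-way meaning documented for `Option<Option<usize>>` is my guess at the intent, so please correct it if the PR means something else.

```rust
use std::collections::HashMap;

/// Simplified stand-in for parquet's `ColumnPath` (assumption: a plain name).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ColumnPath(pub String);

/// Simplified stand-in for the per-column settings.
struct ColumnProperties {
    /// Assumed meaning of the two levels of `Option`:
    /// - `None`: no per-column override, fall back to the writer-wide value
    /// - `Some(None)`: truncation explicitly disabled for this column
    /// - `Some(Some(n))`: truncate this column's statistics to `n` bytes
    statistics_truncate_length: Option<Option<usize>>,
}

/// Simplified stand-in for `WriterProperties`.
pub struct WriterProperties {
    statistics_truncate_length: Option<usize>,
    column_properties: HashMap<ColumnPath, ColumnProperties>,
}

impl WriterProperties {
    /// The old accessor keeps its signature, so existing callers still compile
    /// and only see a deprecation warning.
    #[deprecated(note = "use `column_chunk_statistics_truncate_length` instead")]
    pub fn statistics_truncate_length(&self) -> Option<usize> {
        self.statistics_truncate_length
    }

    /// Hypothetical new accessor: the per-column override wins if present,
    /// otherwise the writer-wide value is returned.
    pub fn column_chunk_statistics_truncate_length(&self, col: &ColumnPath) -> Option<usize> {
        self.column_properties
            .get(col)
            .and_then(|c| c.statistics_truncate_length)
            .unwrap_or(self.statistics_truncate_length)
    }
}
```

With this shape the change is purely additive, so it would not need to wait for a major release, and the doc comment on the field answers the question about the second level of `Option`.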