etseidl commented on code in PR #8098: URL: https://github.com/apache/arrow-rs/pull/8098#discussion_r2282665984
########## parquet/src/file/properties.rs: ########## @@ -665,7 +667,7 @@ impl WriterPropertiesBuilder { assert!(value > 0, "Cannot have a 0 statistics truncate length. If you wish to disable min/max value truncation, set it to `None`."); } - self.statistics_truncate_length = max_length; + self.default_column_properties.statistics_truncate_length = Some(max_length); Review Comment: You should also add a setter to the column properties impl rather than assigning it directly here. ########## parquet/src/file/properties.rs: ########## @@ -1155,6 +1177,11 @@ impl ColumnProperties { fn bloom_filter_properties(&self) -> Option<&BloomFilterProperties> { self.bloom_filter_properties.as_ref() } + + /// Returns the statistics truncate length for this column. + fn statistics_truncate_length(&self) -> Option<Option<usize>> { Review Comment: This makes my head hurt 😅. I assume the fear here is that the `Default` implementation for `ColumnProperties` won't set the truncation length to the default value, but will instead leave it `None` in the case that a different property is set for the column. So the double layer of option is to distinguish between `None` because that's what the user set vs `None` because the option hasn't been set. Perhaps we should just implement a sensible `Default` for `ColumnProperties?` ########## parquet/src/file/properties.rs: ########## @@ -665,7 +667,7 @@ impl WriterPropertiesBuilder { assert!(value > 0, "Cannot have a 0 statistics truncate length. If you wish to disable min/max value truncation, set it to `None`."); } - self.statistics_truncate_length = max_length; + self.default_column_properties.statistics_truncate_length = Some(max_length); Review Comment: Can this be moved down below line 709 so it's with the other per-column properties? ``` // Setters for any column (global) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org