tustvold commented on code in PR #2022:
URL: https://github.com/apache/arrow-rs/pull/2022#discussion_r916873607
##########
parquet/src/file/properties.rs:
##########
@@ -394,23 +376,33 @@ impl WriterPropertiesBuilder {
/// Sets flag to enable/disable statistics for a column.
/// Takes precedence over globally defined settings.
- pub fn set_column_statistics_enabled(mut self, col: ColumnPath, value: bool) -> Self {
- self.get_mut_props(col).set_statistics_enabled(value);
- self
- }
-
- /// Sets max size for statistics for a column.
- /// Takes precedence over globally defined settings.
- pub fn set_column_max_statistics_size(
+ pub fn set_column_statistics_enabled(
mut self,
col: ColumnPath,
- value: usize,
+ value: EnabledStatistics,
) -> Self {
- self.get_mut_props(col).set_max_statistics_size(value);
+ self.get_mut_props(col).set_statistics_enabled(value);
self
}
}
+/// Controls the level of statistics to be computed by the writer
+#[derive(Debug, Clone, Copy, Eq, PartialEq)]
+pub enum EnabledStatistics {
Review Comment:
Glad one of us likes it :laughing:
FWIW, at least in the case of the column index, Parquet allows storing
truncated values in statistics, e.g. instead of a min of `"lorem ipsum ..."`
you could just store `"lorem"`.
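
A rough sketch of what that truncation could look like, assuming values are compared as unsigned bytes in lexicographic order. The names `truncate_min` and `truncate_max` are hypothetical and are not part of the `parquet` crate. A plain prefix is only safe for the min; a max has to be bumped after truncation so it still bounds the original value from above.

```rust
/// Hypothetical helper: a prefix always sorts at or before the full value,
/// so it remains a valid lower bound.
fn truncate_min(value: &[u8], max_len: usize) -> Vec<u8> {
    value[..value.len().min(max_len)].to_vec()
}

/// Hypothetical helper: take the prefix, then increment its last byte that
/// is not 0xFF (dropping any trailing 0xFF bytes) so the result sorts after
/// the full value. Returns `None` when no shorter upper bound exists.
fn truncate_max(value: &[u8], max_len: usize) -> Option<Vec<u8>> {
    if value.len() <= max_len {
        // Already short enough, keep the exact value.
        return Some(value.to_vec());
    }
    let mut prefix = value[..max_len].to_vec();
    while let Some(last) = prefix.pop() {
        if last != u8::MAX {
            prefix.push(last + 1);
            return Some(prefix);
        }
        // The byte was 0xFF and cannot be incremented; drop it and retry.
    }
    None
}
```

With a limit of 5 bytes, `truncate_min(b"lorem ipsum", 5)` gives `b"lorem"` and `truncate_max(b"lorem ipsum", 5)` gives `b"loren"`. Note that incrementing a byte can produce invalid UTF-8, so a real implementation for string columns would likely need to work on character boundaries instead.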