AdamGS commented on code in PR #5181:
URL: https://github.com/apache/arrow-rs/pull/5181#discussion_r1419036780


##########
parquet/src/column/writer/mod.rs:
##########
@@ -764,19 +764,22 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
 
         self.column_metrics.num_column_nulls += self.page_metrics.num_page_nulls;
 
-        let page_statistics = match (values_data.min_value, values_data.max_value) {
-            (Some(min), Some(max)) => {
-                update_min(&self.descr, &min, &mut self.column_metrics.min_column_value);
-                update_max(&self.descr, &max, &mut self.column_metrics.max_column_value);
-                Some(ValueStatistics::new(
-                    Some(min),
-                    Some(max),
-                    None,
-                    self.page_metrics.num_page_nulls,
-                    false,
-                ))
-            }
-            _ => None,
+        let page_statistics = if let (Some(min), Some(max)) =

Review Comment:
   There's 
[this](https://github.com/apache/arrow-rs/blob/490c080e5ba7a50efc862da9508e6669900549ee/parquet/src/column/writer/mod.rs#L347)
 branch, which calculates chunk statistics directly when 
`EnabledStatistics::Chunk` is set. I think making it the default (for both 
`Page` and `Chunk`) would probably simplify the code, since you wouldn't have 
to keep track of the chunk-level metadata when adding pages. It might require 
a bit more work, though, which is why I didn't end up going that way.
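
   To make the trade-off concrete, here is a rough sketch of the two approaches. 
It is not the actual writer code: `MinMax`, `min_max`, `chunk_stats_from_pages` 
and `chunk_stats_direct` are hypothetical names, and plain `i64` values stand in 
for the real column types and comparison logic.

```rust
/// Hypothetical, simplified stand-in for min/max statistics.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MinMax {
    pub min: i64,
    pub max: i64,
}

/// Computes min/max over a slice of values; `None` if the slice is empty.
pub fn min_max(values: &[i64]) -> Option<MinMax> {
    let min = *values.iter().min()?;
    let max = *values.iter().max()?;
    Some(MinMax { min, max })
}

/// Page-driven approach: each page's statistics are folded into a running
/// chunk-level value, so chunk state has to be tracked while adding pages.
pub fn chunk_stats_from_pages(pages: &[Vec<i64>]) -> Option<MinMax> {
    let mut chunk: Option<MinMax> = None;
    for page in pages {
        // Pages without values contribute no statistics.
        if let Some(page_stats) = min_max(page) {
            chunk = Some(match chunk {
                Some(c) => MinMax {
                    min: c.min.min(page_stats.min),
                    max: c.max.max(page_stats.max),
                },
                None => page_stats,
            });
        }
    }
    chunk
}

/// Direct approach: chunk statistics come straight from the values,
/// independent of page boundaries.
pub fn chunk_stats_direct(pages: &[Vec<i64>]) -> Option<MinMax> {
    let all_values: Vec<i64> = pages.iter().flatten().copied().collect();
    min_max(&all_values)
}
```

   Both functions should agree on the result; the difference is only where the 
chunk-level bookkeeping lives.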


