alamb commented on code in PR #6181:
URL: https://github.com/apache/arrow-rs/pull/6181#discussion_r1701923306


##########
parquet/src/arrow/arrow_reader/statistics.rs:
##########
@@ -943,6 +973,41 @@ macro_rules! get_data_page_statistics {
                     }
                     Ok(Arc::new(builder.finish()))
                 },
+                Some(DataType::Utf8View) => {
+                    let mut builder = StringViewBuilder::new();
+                    let iterator = [<$stat_type_prefix ByteArrayDataPageStatsIterator>]::new($iterator);
+                    for x in iterator {
+                        for x in x.into_iter() {
+                            let Some(x) = x else {
+                                builder.append_null(); // no statistics value
+                                continue;
+                            };
+
+                            let Ok(x) = std::str::from_utf8(x.data()) else {
+                                builder.append_null();
+                                continue;
+                            };
+
+                            builder.append_value(x);
+                        }
+                    }
+                    Ok(Arc::new(builder.finish()))
+                },
+                Some(DataType::BinaryView) => {
+                    let mut builder = BinaryViewBuilder::new();
+                    let iterator = [<$stat_type_prefix ByteArrayDataPageStatsIterator>]::new($iterator);
+                    for x in iterator {
+                        for x in x.into_iter() {
+                            let Some(x) = x else {
+                                builder.append_null(); // no statistics value
+                                continue;
+                            };
+
+                            builder.append_value(x);
+                        }
+                    }
+                    Ok(Arc::new(builder.finish()))
+                },
                 _ => unimplemented!()

Review Comment:
   Can you please also change this catch-all / panic to:
   1. Match all missing data types explicitly (this makes it clear which types are handled and which are not)
   2. Return a null array for unimplemented types

   Basically, make it follow the model of `get_statistics`:
https://github.com/apache/arrow-rs/blob/01407f4824fa4eb656398558183c7cd01537246e/parquet/src/arrow/arrow_reader/statistics.rs#L451-L468
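A rough sketch of the shape being asked for, using only the standard library so it compiles on its own. Everything here is a hypothetical stand-in: the cut-down `DataType` enum, the `StatsColumn` alias, and `page_stats_to_column` are illustrative names, not the actual macro or the arrow-rs types. In the real macro the fallback arm would presumably build the null array the way `get_statistics` does, rather than a `Vec` of `None`.

```rust
// Hypothetical, simplified stand-in for arrow's `DataType`, reduced to a
// few variants so the sketch compiles without the arrow crates.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8View,
    BinaryView,
    Map,
    Struct,
    Union,
}

/// One optional value per data page; `None` models a null slot.
/// (Assumption: stands in for an Arrow array of statistics values.)
type StatsColumn = Vec<Option<Vec<u8>>>;

/// Convert per-page statistics into a column, naming every type explicitly
/// instead of ending the match with `_ => unimplemented!()`.
fn page_stats_to_column(data_type: &DataType, pages: Vec<Option<Vec<u8>>>) -> StatsColumn {
    match data_type {
        // Missing statistics stay null; invalid UTF-8 also becomes null,
        // mirroring the Utf8View arm in the diff.
        DataType::Utf8View => pages
            .into_iter()
            .map(|v| v.filter(|bytes| std::str::from_utf8(bytes).is_ok()))
            .collect(),
        // Binary values are passed through unchanged.
        DataType::BinaryView => pages,
        // Not implemented: listed explicitly so that adding a new variant
        // is a compile error here rather than a runtime panic, and so
        // callers get an all-null column of the right length.
        DataType::Map | DataType::Struct | DataType::Union => {
            vec![None; pages.len()]
        }
    }
}
```

With no wildcard arm, the compiler enforces point 1 (every type is either handled or named as unhandled), and the last arm covers point 2 by returning nulls with one slot per page instead of panicking.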



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
