scovich commented on code in PR #8600:
URL: https://github.com/apache/arrow-rs/pull/8600#discussion_r2445833163


##########
arrow-array/src/builder/generic_bytes_builder.rs:
##########
@@ -348,6 +348,40 @@ impl<O: OffsetSizeTrait> std::fmt::Write for 
GenericStringBuilder<O> {
     }
 }
 
+const AVERAGE_STRING_LENGTH: usize = 16;
+/// Trait for string-like array builders
+///
+/// This trait provides unified interface for builders that append string-like 
data
+/// such as [`GenericStringBuilder<O>`] and 
[`crate::builder::StringViewBuilder`]
+pub trait StringLikeArrayBuilder: ArrayBuilder {

Review Comment:
   nit: does it actually need to be pub?



##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -844,6 +947,16 @@ fn typed_value_to_variant<'a>(
             let value = array.value(index);
             Variant::from(value)
         }
+        DataType::LargeUtf8 => {
+            let array = typed_value.as_string::<i64>();
+            let value = array.value(index);
+            Variant::from(value)
+        }
+        DataType::Utf8View => {
+            let array = typed_value.as_string_view();
+            let value = array.value(index);
+            Variant::from(value)
+        }

Review Comment:
   aside: there's a lot of duplication here (both new and existing code). 
Should we consider tracking a follow-up item to introduce either a trait or a 
macro that abstracts away the boilerplate?



##########
arrow-array/src/builder/generic_bytes_builder.rs:
##########
@@ -348,6 +348,40 @@ impl<O: OffsetSizeTrait> std::fmt::Write for 
GenericStringBuilder<O> {
     }
 }
 
+const AVERAGE_STRING_LENGTH: usize = 16;
+/// Trait for string-like array builders
+///
+/// This trait provides unified interface for builders that append string-like 
data
+/// such as [`GenericStringBuilder<O>`] and 
[`crate::builder::StringViewBuilder`]
+pub trait StringLikeArrayBuilder: ArrayBuilder {

Review Comment:
   update: I just realized -- this is making a public API change to arrow-array 
(not isolated to variant crate). 
   
   I'm fine with that, in principle, but we should make sure it's a very 
intentional change? 
   In particular, it's a one-way door to make this public, but a two-way door 
to make it variant-only at first.
   
   CC @alamb 



##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -413,6 +432,13 @@ macro_rules! define_variant_to_primitive_builder {
     }
 }
 
+define_variant_to_primitive_builder!(
+    struct VariantToStringArrowBuilder<'a, B: StringLikeArrayBuilder>
+    |capacity| -> B { B::with_capacity(capacity) },
+    |value| value.as_string(),
+    type_name: B::type_name()
+);

Review Comment:
   🎉 



##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -67,7 +71,6 @@ pub(crate) enum PrimitiveVariantToArrowRowBuilder<'a> {
 pub(crate) enum VariantToArrowRowBuilder<'a> {
     Primitive(PrimitiveVariantToArrowRowBuilder<'a>),
     BinaryVariant(VariantToBinaryVariantArrowRowBuilder),
-

Review Comment:
   intentional change? or noise?



##########
arrow-array/src/builder/generic_bytes_builder.rs:
##########
@@ -348,6 +348,40 @@ impl<O: OffsetSizeTrait> std::fmt::Write for 
GenericStringBuilder<O> {
     }
 }
 
+const AVERAGE_STRING_LENGTH: usize = 16;
+/// Trait for string-like array builders
+///
+/// This trait provides unified interface for builders that append string-like 
data
+/// such as [`GenericStringBuilder<O>`] and 
[`crate::builder::StringViewBuilder`]
+pub trait StringLikeArrayBuilder: ArrayBuilder {
+    /// Returns a human-readable type name for the builder.
+    fn type_name() -> &'static str;
+
+    /// Creates a new builder with the given row capacity.
+    fn with_capacity(capacity: usize) -> Self;
+
+    /// Appends a non-null string value to the builder.
+    fn append_value(&mut self, value: &str);
+
+    /// Appends a null value to the builder.
+    fn append_null(&mut self);
+}
+
+impl<O: OffsetSizeTrait> StringLikeArrayBuilder for GenericStringBuilder<O> {
+    fn type_name() -> &'static str {
+        std::any::type_name::<GenericStringBuilder<O>>()

Review Comment:
   I _believe_ we can simplify these:
   ```suggestion
           std::any::type_name::<Self>()
   ```
   (sadly, it can't be a provided method... but at least it can be simpler?)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to