alamb commented on code in PR #9959:
URL: https://github.com/apache/arrow-rs/pull/9959#discussion_r3244351606


##########
arrow-array/src/array/run_array.rs:
##########
@@ -245,15 +245,22 @@ impl<R: RunEndIndexType> RunArray<R> {
     /// assert_eq!(new_run_array.run_ends().values(), &[2, 3, 5]);
     /// ```
     pub fn with_values(&self, values: ArrayRef) -> Self {
-        assert_eq!(values.len(), self.values().len());
+        assert_eq!(values.len(), self.values.len());
         let (run_ends_field, values_field) = match &self.data_type {
-            DataType::RunEndEncoded(r, v) => (r, v),
+            DataType::RunEndEncoded(r, v) => (
+                r,
+                Arc::new(
+                    v.as_ref()
+                        .clone()
+                        .with_data_type(values.data_type().clone()),
+                ),
+            ),
             _ => unreachable!("RunArray should have type RunEndEncoded"),
         };
-        let data_type =
-            DataType::RunEndEncoded(Arc::clone(run_ends_field), 
Arc::clone(values_field));
+        let dt = DataType::RunEndEncoded(Arc::clone(run_ends_field), 
Arc::clone(&values_field));

Review Comment:
   I think you could just do this (no need to clone the new value)
   
   ```rust
           let dt = DataType::RunEndEncoded(Arc::clone(run_ends_field), 
values_field);
   ```
   
   Also if you kept the `data_type` name it would keep the line below cleaner 
too
   
   ```rust
           let data_type = ...
   ```



##########
arrow-array/src/array/run_array.rs:
##########
@@ -781,6 +788,41 @@ where
         RunArrayIter::new(self)
     }
 }
+/// An array that can be downcast to a [`RunArray`] of any run end type and 
any value type.
+///
+/// This can be used to efficiently implement kernels for all possible run end
+/// types without needing to create specialized implementations for each key 
type.
+pub trait AnyRunEndArray: Array {
+    /// Returns the run ends of this array as a primitive array.
+    fn run_ends(&self) -> ArrayRef;
+
+    /// Returns the values of this array.
+    fn values(&self) -> &Arc<dyn Array>;
+
+    /// Returns a new run-end encoded array with the given values, preserving 
the
+    /// existing run ends.
+    fn with_values(&self, values: ArrayRef) -> ArrayRef;
+}
+
+impl<R: RunEndIndexType> AnyRunEndArray for RunArray<R> {
+    fn run_ends(&self) -> ArrayRef {
+        let data = unsafe {
+            ArrayDataBuilder::new(R::DATA_TYPE)
+                .len(self.run_ends.values().len())
+                .buffers(vec![self.run_ends.inner().inner().clone()])
+                .build_unchecked()
+        };
+        Arc::new(PrimitiveArray::<R>::from(data))

Review Comment:
   It would be faster to make the PrimitiveArray directly rather than ArrayDat 
a-- among other things it doesn't require allocating a `vec` and it doesn't 
need unsafe:
   
   ```rust
       fn run_ends(&self) -> ArrayRef {
           let values  = self.run_ends.inner().clone();
           let nulls = None;
           Arc::new(PrimitiveArray::<R>::new(values, nulls))
       }
   ```



##########
arrow-array/src/array/run_array.rs:
##########
@@ -245,15 +245,22 @@ impl<R: RunEndIndexType> RunArray<R> {
     /// assert_eq!(new_run_array.run_ends().values(), &[2, 3, 5]);
     /// ```
     pub fn with_values(&self, values: ArrayRef) -> Self {
-        assert_eq!(values.len(), self.values().len());
+        assert_eq!(values.len(), self.values.len());
         let (run_ends_field, values_field) = match &self.data_type {
-            DataType::RunEndEncoded(r, v) => (r, v),
+            DataType::RunEndEncoded(r, v) => (
+                r,
+                Arc::new(

Review Comment:
   is this a bug fix? It seems like it is now (correctly) using the 
values.data_type rather than the the pre-exsisting data type
   
   I think the new code looks correct but should we add a test?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to