gstvg commented on issue #5700:
URL: https://github.com/apache/arrow-rs/issues/5700#issuecomment-2111366957

   Hi @HadrienG2
   Are you working on array builders only, or also on accessing concrete array
   data?
   I've been playing with typed accessors for concrete arrays for the past few
   weeks, and pushed part of it as a draft in #5767.
   
   My current approach is to use derive macros on a struct of arrays:
   
   
   ```
   #[derive(StructArray)]
   struct MyStructArray<'a> {
      number: &'a Int32Array,
      text: &'a StringArray,
   }
   ```
   
   This would implement `TryFrom<&'a StructArray>` and
   `TypedStructInnerAccessor` for `MyStructArray`, and `ArrayAccessor` for
   `&'a TypedStructArray<T>` (where `T: TypedStructInnerAccessor`) with
   `type Item = T::Item`.
   
   Here `TypedStructInnerAccessor::Item` is a generated type with a few
   methods for each column:
   
   ```
   struct MyStructArrayStructAccessorItem<'a> {
      // Named `struct_` because the methods below access `self.struct_`.
      struct_: &'a TypedStructArray<'a, MyStructArray<'a>>,
      index: usize
   }
   
   impl<'a> MyStructArrayStructAccessorItem<'a> {
   
       #[inline]
       fn is_valid(&self) -> ::std::primitive::bool {
           self.struct_.is_valid(self.index)
       }
       #[inline]
       fn is_null(&self) -> ::std::primitive::bool {
           self.struct_.is_null(self.index)
       }
       #[inline]
       fn text(&self) -> <&'a StringArray as ::arrow::array::ArrayAccessor>::Item {
           (&self.struct_.fields().text).value(self.index)
       }
       #[inline]
       fn text_opt(
           &self,
       ) -> ::std::option::Option<
           <&'a StringArray as ::arrow::array::ArrayAccessor>::Item,
       > {
           if (&self.struct_.fields().text).is_valid(self.index) {
               Some((&self.struct_.fields().text).value(self.index))
           } else {
               None
           }
       }
       #[inline]
       fn is_text_valid(&self) -> ::std::primitive::bool {
           self.struct_.fields().text.is_valid(self.index)
       }
   
   
       // ... the same methods for `number`
   
   }
   
   ```
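
For illustration only, here is a hand-written sketch of the kind of `TryFrom<&'a StructArray>` impl the derive could generate for `MyStructArray`. It does not use the arrow crate: `Int32Array`, `StringArray`, `StructArray`, `Column` and `column_by_name` below are simplified stand-ins (assumptions made so that the snippet compiles on its own), and the actual macro output in the draft may look quite different.

```rust
use std::convert::TryFrom;

// Stand-ins for arrow's concrete arrays (assumption: simplified, no null buffers).
struct Int32Array(Vec<i32>);
struct StringArray(Vec<String>);

// Stand-in for a type-erased column; arrow uses `dyn Array` plus downcasting.
enum Column {
    Int32(Int32Array),
    Utf8(StringArray),
}

// Stand-in for `StructArray`: a list of named columns.
struct StructArray {
    columns: Vec<(&'static str, Column)>,
}

impl StructArray {
    // Look a column up by field name.
    fn column_by_name(&self, name: &str) -> Option<&Column> {
        self.columns.iter().find(|(n, _)| *n == name).map(|(_, c)| c)
    }
}

// What the user writes (the derive attribute is omitted in this sketch).
struct MyStructArray<'a> {
    number: &'a Int32Array,
    text: &'a StringArray,
}

// Hand-written equivalent of what a derive could plausibly generate:
// one lookup and one type check per field, done once up front.
impl<'a> TryFrom<&'a StructArray> for MyStructArray<'a> {
    type Error = String;

    fn try_from(value: &'a StructArray) -> Result<Self, Self::Error> {
        let number = match value.column_by_name("number") {
            Some(Column::Int32(a)) => a,
            Some(_) => return Err("column `number` is not an Int32Array".to_string()),
            None => return Err("missing column `number`".to_string()),
        };
        let text = match value.column_by_name("text") {
            Some(Column::Utf8(a)) => a,
            Some(_) => return Err("column `text` is not a StringArray".to_string()),
            None => return Err("missing column `text`".to_string()),
        };
        Ok(Self { number, text })
    }
}
```

Doing the name lookup and type check once in `try_from` means the per-row accessors can then read the typed columns directly.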
   
   `TypedStructArray` definition:
   
   
   ```
   #[derive(Debug)]
   pub struct TypedStructArray<'a, T> {
       fields: T,
       struct_: &'a StructArray,
   }
   
   impl<'a, T> TypedStructArray<'a, T> {
       pub fn fields(&self) -> &T {
           &self.fields
       }
   }
   
   impl<'a, T: TryFrom<&'a StructArray, Error = ArrowError>> TryFrom<&'a dyn Array>
       for TypedStructArray<'a, T>
   {
       type Error = ArrowError;
   
       fn try_from(value: &'a dyn Array) -> Result<Self, Self::Error> {
           let struct_ = <&'a StructArray>::try_from(value)?;
   
           Ok(Self {
               // `struct_` is already a `&'a StructArray`, so pass it as-is.
               fields: T::try_from(struct_)?,
               struct_,
           })
       }
   }
   
   impl<'a, T: TypedStructInnerAccessor<'a>> ArrayAccessor for &'a TypedStructArray<'a, T> {
       type Item = T::Item;
   
       fn value(&self, index: ::std::primitive::usize) -> Self::Item {
           assert!(
               index < self.len(),
               "Trying to access an element at index {} from a TypedStructArray of length {}",
               index,
               self.len()
           );
           unsafe { self.value_unchecked(index) }
       }
   
       unsafe fn value_unchecked(&self, index: ::std::primitive::usize) -> Self::Item {
           (*self, index).into()
       }
   }
   pub trait TypedStructInnerAccessor<'a>: std::fmt::Debug + Send + Sync + Sized + 'a {
       type Item: std::fmt::Debug + Send + Sync + From<(&'a TypedStructArray<'a, Self>, usize)>;
   }
   
   impl<'a, T: std::fmt::Debug + Send + Sync> Array for TypedStructArray<'a, T> {
       fn as_any(&self) -> &dyn std::any::Any {
           self.struct_.as_any()
       }
       // ... forward other Array methods to self.struct_
   }
   ```
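
The `(*self, index).into()` call in `value_unchecked` relies on the `From<(&'a TypedStructArray<Self>, usize)>` bound on `Item`. Below is a minimal std-only sketch of that pattern. `TypedArray`, `InnerAccessor`, `Points` and `PointItem` are hypothetical names standing in for `TypedStructArray`, `TypedStructInnerAccessor`, a derived struct and its generated item; they are assumptions for illustration, not the draft's code.

```rust
// Simplified stand-in for `TypedStructArray` (assumption: no arrow types, no nulls).
struct TypedArray<T> {
    fields: T,
    len: usize,
}

// Mirrors the shape of `TypedStructInnerAccessor`: the item type must be
// constructible from an (array reference, row index) pair.
trait InnerAccessor<'a>: Sized + 'a {
    type Item: From<(&'a TypedArray<Self>, usize)>;
}

impl<T> TypedArray<T> {
    // Bounds-checked access that builds the row handle through `From`.
    fn value<'a>(&'a self, index: usize) -> <T as InnerAccessor<'a>>::Item
    where
        T: InnerAccessor<'a>,
    {
        assert!(
            index < self.len,
            "index {} out of bounds for length {}",
            index,
            self.len
        );
        (self, index).into()
    }
}

// Hypothetical column set, playing the role of a derived struct.
struct Points {
    x: Vec<i32>,
    y: Vec<i32>,
}

// Hypothetical row handle, playing the role of the generated item type.
struct PointItem<'a> {
    array: &'a TypedArray<Points>,
    index: usize,
}

impl<'a> From<(&'a TypedArray<Points>, usize)> for PointItem<'a> {
    fn from((array, index): (&'a TypedArray<Points>, usize)) -> Self {
        PointItem { array, index }
    }
}

impl<'a> InnerAccessor<'a> for Points {
    type Item = PointItem<'a>;
}

impl<'a> PointItem<'a> {
    // Each accessor reads exactly one column at the stored index.
    fn x(&self) -> i32 {
        self.array.fields.x[self.index]
    }
    fn y(&self) -> i32 {
        self.array.fields.y[self.index]
    }
}
```

Because the generic `value` only needs the `From` bound, the derive only has to emit the item type and its `From` impl for each struct.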
   
   The reason for returning an intermediate value with methods, instead of a
   tuple or a struct holding the values, is to avoid accessing any memory that
   the user may not want to read.
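
A rough way to see the effect of this choice is the std-only sketch below. `CountingColumn`, `Columns`, `Row` and `sum_numbers` are hypothetical names, and the read counter exists only to make the laziness observable; arrow arrays have nothing like it.

```rust
use std::cell::Cell;

// Stand-in column that counts value reads (assumption: illustration only).
struct CountingColumn<T> {
    values: Vec<T>,
    valid: Vec<bool>,
    reads: Cell<usize>,
}

impl<T: Copy> CountingColumn<T> {
    fn new(values: Vec<T>, valid: Vec<bool>) -> Self {
        assert_eq!(values.len(), valid.len());
        CountingColumn { values, valid, reads: Cell::new(0) }
    }

    fn is_valid(&self, i: usize) -> bool {
        self.valid[i]
    }

    // Reads one value and records that the values buffer was touched.
    fn value(&self, i: usize) -> T {
        self.reads.set(self.reads.get() + 1);
        self.values[i]
    }
}

struct Columns {
    number: CountingColumn<i32>,
    score: CountingColumn<f64>,
}

// Lazy row handle: stores only a reference and an index, reads nothing yet.
struct Row<'a> {
    columns: &'a Columns,
    index: usize,
}

impl<'a> Row<'a> {
    fn number_opt(&self) -> Option<i32> {
        if self.columns.number.is_valid(self.index) {
            Some(self.columns.number.value(self.index))
        } else {
            None
        }
    }

    fn score(&self) -> f64 {
        self.columns.score.value(self.index)
    }
}

// Sums the valid entries of `number`; the `score` column is never read.
fn sum_numbers(columns: &Columns) -> i32 {
    (0..columns.number.values.len())
        .map(|index| Row { columns, index })
        .filter_map(|row| row.number_opt())
        .sum()
}
```

An eager design that returned a `(i32, f64)` tuple per row would have to read `score` for every row, even though `sum_numbers` never uses it.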
   
   I'm also using the same approach for `RecordBatch`, `Union`, `Map`,
   `GenericList`, `FixedSizeList` and `FixedSizeBinary`.
   

