tustvold commented on code in PR #3134:
URL: https://github.com/apache/arrow-rs/pull/3134#discussion_r1028142772
##########
arrow-array/src/builder/primitive_builder.rs:
##########
@@ -228,6 +228,14 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
pub fn values_slice_mut(&mut self) -> &mut [T::Native] {
self.values_builder.as_slice_mut()
}
+
+ /// Returns the current values buffer and null buffer as a slice
+ pub fn as_slice(&mut self) -> (&mut [T::Native], Option<&[u8]>) {
+ (
+ self.values_builder.as_slice_mut(),
+ self.null_buffer_builder.as_slice(),
+ )
Review Comment:
```suggestion
pub fn slices_mut(&mut self) -> (&mut [T::Native], Option<&mut [u8]>) {
(
self.values_builder.as_slice_mut(),
self.null_buffer_builder.as_slice_mut(),
)
```
Or something, it seems a bit unusual for both to not be mutable.
We could then add `validity_slice` and `validity_slice_mut` for completeness
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -469,6 +469,42 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
})
}
+ /// Applies an unary and fallible function to all valid values in a
mutable primitive array.
+ /// Mutable primitive array means that the buffer is not shared with other
arrays.
+ /// As a result, this mutates the buffer directly without allocating new
buffer.
+ ///
+ /// This is unlike [`Self::unary_mut`] which will apply an infallible
function to all rows
+ /// regardless of validity, in many cases this will be significantly
faster and should
+ /// be preferred if `op` is infallible.
+ ///
+ /// This returns an `Err` for two cases. First is input array is shared
buffer with other
+ /// array. In the case, returned `Err` wraps a `Ok` of input array.
Second, if the function
+ /// encounters an error during applying on values. In the case, returned
`Err` wraps an
+ /// `Err` of the actual error.
+ ///
+ /// Note: LLVM is currently unable to effectively vectorize fallible
operations
+ pub fn try_unary_mut<F, E>(
+ self,
+ op: F,
+ ) -> Result<PrimitiveArray<T>, Result<PrimitiveArray<T>, E>>
Review Comment:
```suggestion
) -> Result<Result<PrimitiveArray<T>, E>, PrimitiveArray<T>>
```
I think the reverse result order makes more sense, as it represents the
order of fallibility. If we can't convert to a builder is the first error case,
then within that we have the error case of ArrowError.
I think it will also make it easier to implement a fallback, e.g.
```
arr.try_unary_mut(&mut op).unwrap_or_else(|arr| arr.try_unary(&mut op))?
```
##########
arrow-array/src/builder/null_buffer_builder.rs:
##########
@@ -150,6 +150,11 @@ impl NullBufferBuilder {
self.bitmap_builder = Some(b);
}
}
+
+ #[inline]
+ pub fn as_slice(&self) -> Option<&[u8]> {
Review Comment:
:+1:
Could also add an `as_slice_mut` for completeness
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]