tustvold commented on code in PR #3115:
URL: https://github.com/apache/arrow-rs/pull/3115#discussion_r1023452284
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -489,6 +544,42 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
)
}
}
+
+ /// Returns `PrimitiveBuilder` of this primitive array for mutating its values if the underlying
+ /// data buffer is not shared by others.
+ pub fn into_builder(self) -> Result<PrimitiveBuilder<T>, Self> {
+ let null_buffer = self
+ .data
+ .null_buffer()
+ .cloned()
+ .and_then(|b| b.into_mutable(0).ok());
+
+ let len = self.len();
+ let null_bit_buffer = self.data.null_buffer().cloned();
+
+ let buffer = self.data.buffers()[0].clone();
Review Comment:
Same thing here, I think this needs to call `slice_with_length`
##########
arrow-buffer/src/buffer/immutable.rs:
##########
@@ -227,6 +227,25 @@ impl Buffer {
pub fn count_set_bits_offset(&self, offset: usize, len: usize) -> usize {
UnalignedBitChunk::new(self.as_slice(), offset, len).count_ones()
}
+
+ /// Returns `MutableBuffer` for mutating the buffer if this buffer is not shared.
+ /// Returns `Err` if this is shared or its allocation is from an external source.
+ pub fn into_mutable(self, len: usize) -> Result<MutableBuffer, Self> {
Review Comment:
Why does this take a len? How does this differ from `Buffer::length`?
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -397,6 +397,61 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
unsafe { build_primitive_array(len, buffer, null_count, null_buffer) }
}
+ /// Applies an unary and infallible function to a mutable primitive array.
+ /// Mutable primitive array means that the buffer is not shared with other arrays.
+ /// As a result, this mutates the buffer directly without allocating new buffer.
+ ///
+ /// # Implementation
+ ///
+ /// This will apply the function for all values, including those on null slots.
+ /// This implies that the operation must be infallible for any value of the corresponding type
+ /// or this function may panic.
+ /// # Example
+ /// ```rust
+ /// # use arrow_array::{Int32Array, types::Int32Type};
+ /// # fn main() {
+ /// let array = Int32Array::from(vec![Some(5), Some(7), None]);
+ /// let c = array.unary_mut(|x| x * 2 + 1).unwrap();
+ /// assert_eq!(c, Int32Array::from(vec![Some(11), Some(15), None]));
+ /// # }
+ /// ```
+ pub fn unary_mut<F>(self, op: F) -> Result<PrimitiveArray<T>, PrimitiveArray<T>>
Review Comment:
I wonder if we can use `into_builder` here, to avoid duplication?
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -1939,4 +2032,52 @@ mod tests {
array.value(4);
}
+
+ #[test]
+ fn test_into_builder() {
+ let array: Int32Array = vec![1, 2, 3].into_iter().map(Some).collect();
+
+ let boxed: ArrayRef = Arc::new(array);
+ let col: Int32Array = downcast_array(&boxed);
+ drop(boxed);
+
+ let mut builder = col.into_builder().unwrap();
+
+ builder.append_value(4);
+ builder.append_null();
+ builder.append_value(2);
+
+ let expected: Int32Array = vec![Some(4), None, Some(2)].into_iter().collect();
Review Comment:
I would have expected `into_builder` to keep the current values?
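To make the expectation concrete, here is a minimal sketch using only the standard library, not the arrow-rs API. `Arc<Vec<i32>>` stands in for the array's shared buffer, and `into_mutable_vec` and `append_keeping_existing` are hypothetical names. The point is only that recovering unique ownership should preserve the existing values, so later appends land after them.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `into_builder`: recover a mutable `Vec` from a
/// reference-counted buffer only when no other handle shares it.
pub fn into_mutable_vec(shared: Arc<Vec<i32>>) -> Result<Vec<i32>, Arc<Vec<i32>>> {
    // `Arc::try_unwrap` succeeds only when the strong count is exactly one;
    // otherwise it hands the original `Arc` back unchanged.
    Arc::try_unwrap(shared)
}

/// Appends `extra` to the recovered buffer, keeping the values already there.
pub fn append_keeping_existing(
    shared: Arc<Vec<i32>>,
    extra: &[i32],
) -> Result<Vec<i32>, Arc<Vec<i32>>> {
    let mut values = into_mutable_vec(shared)?;
    values.extend_from_slice(extra);
    Ok(values)
}
```

Under this model, starting from `[1, 2, 3]` and appending `4` and `2` would give `[1, 2, 3, 4, 2]`, and the call would return `Err` with the untouched buffer while another clone of the `Arc` is alive.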
##########
arrow-buffer/src/buffer/mutable.rs:
##########
@@ -92,6 +93,23 @@ impl MutableBuffer {
}
}
+ /// Allocates a new [MutableBuffer] from given `Bytes`.
+ pub(crate) fn from_bytes(bytes: Bytes, len: usize) -> Result<Self, Bytes> {
Review Comment:
How does the provided `len` differ from `Bytes::len`?
##########
arrow-array/src/builder/buffer_builder.rs:
##########
@@ -124,6 +124,14 @@ impl<T: ArrowNativeType> BufferBuilder<T> {
}
}
+ pub fn new_from_buffer(buffer: MutableBuffer) -> Self {
+ Self {
+ buffer,
+ len: 0,
Review Comment:
```suggestion
len: buffer.len() / std::mem::size_of::<T::Native>(),
```
?
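A rough sketch of the arithmetic behind this suggestion, using a stand-in type rather than the real `BufferBuilder`. `SketchBuilder` and its fields are hypothetical, and `Vec<u8>` stands in for `MutableBuffer`. Note that in a struct literal the length would have to be read before `buffer` is moved into the field, so the sketch computes it first.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical stand-in for a typed builder over a byte buffer.
pub struct SketchBuilder<T> {
    pub buffer: Vec<u8>,
    /// Length in elements of `T`, not in bytes.
    pub len: usize,
    _marker: PhantomData<T>,
}

impl<T> SketchBuilder<T> {
    /// Wraps an existing byte buffer and treats its contents as already
    /// appended elements. Assumes `T` is not zero-sized.
    pub fn new_from_buffer(buffer: Vec<u8>) -> Self {
        // Read the element count before `buffer` is moved into the struct.
        let len = buffer.len() / size_of::<T>();
        Self {
            buffer,
            len,
            _marker: PhantomData,
        }
    }
}
```

With this, a 12-byte buffer wrapped as `SketchBuilder<i32>` reports three elements instead of zero.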
##########
arrow-array/src/builder/boolean_buffer_builder.rs:
##########
@@ -33,6 +33,10 @@ impl BooleanBufferBuilder {
Self { buffer, len: 0 }
}
+ pub fn new_from_buffer(buffer: MutableBuffer) -> Self {
Review Comment:
I think this needs to be provided with the length in bits?
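The reason, sketched with plain arithmetic and hypothetical helper names (nothing here is arrow-rs API): a bitmap is stored in whole bytes, so the byte length of the buffer alone cannot tell you how many bits are valid.

```rust
use std::ops::RangeInclusive;

/// Bytes needed to store `bit_len` bits, rounding up to a whole byte.
pub fn bytes_for_bits(bit_len: usize) -> usize {
    (bit_len + 7) / 8
}

/// Every non-empty bit length that occupies exactly `byte_len` bytes.
/// An empty buffer can only hold zero bits.
pub fn possible_bit_lens(byte_len: usize) -> RangeInclusive<usize> {
    if byte_len == 0 {
        0..=0
    } else {
        ((byte_len - 1) * 8 + 1)..=(byte_len * 8)
    }
}
```

For example, a 2-byte buffer could hold anywhere from 9 to 16 bits, so the caller would need to say which.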