Re: [PR] Buffer builder hot path opts [arrow-rs]

via GitHub Wed, 11 Feb 2026 11:33:26 -0800


alamb commented on code in PR #9393:
URL: https://github.com/apache/arrow-rs/pull/9393#discussion_r2795101299



##########
arrow-select/src/interleave.rs:
##########
@@ -155,12 +157,12 @@ fn interleave_primitive<T: ArrowPrimitiveType>(
 ) -> Result<ArrayRef, ArrowError> {
     let interleaved = Interleave::<'_, PrimitiveArray<T>>::new(values, 
indices);
 
-    let values = indices
+    let values: ScalarBuffer<T::Native> = indices
         .iter()
         .map(|(a, b)| interleaved.arrays[*a].value(*b))
-        .collect::<Vec<_>>();
+        .collect();
 
-    let array = PrimitiveArray::<T>::try_new(values.into(), 
interleaved.nulls)?;
+    let array = PrimitiveArray::<T>::try_new(values, interleaved.nulls)?;

Review Comment:
   Basically there are a bunch of places in the code I think that make 
ScalarBuffer from `Vec`s and assume that is fast. If we can make it faster, 
maybe we can optimize that path (in one place) rather than having to change all 
call sites to use collect `ScalarBuffer` directly 🤔 



##########
arrow-select/src/interleave.rs:
##########
@@ -155,12 +157,12 @@ fn interleave_primitive<T: ArrowPrimitiveType>(
 ) -> Result<ArrayRef, ArrowError> {
     let interleaved = Interleave::<'_, PrimitiveArray<T>>::new(values, 
indices);
 
-    let values = indices
+    let values: ScalarBuffer<T::Native> = indices
         .iter()
         .map(|(a, b)| interleaved.arrays[*a].value(*b))
-        .collect::<Vec<_>>();
+        .collect();
 
-    let array = PrimitiveArray::<T>::try_new(values.into(), 
interleaved.nulls)?;
+    let array = PrimitiveArray::<T>::try_new(values, interleaved.nulls)?;

Review Comment:
   I don't understand why this would help
   
   The old code did `ScalarBuffer::from` 
   
   
https://github.com/apache/arrow-rs/blob/f9dd799f7baf9731ee01027c10cf6b64c94a793f/arrow-buffer/src/buffer/scalar.rs#L205-L212
   
   Which calls `Buffer::from_vec`
   
   Which makes a MutableBuffer by taking over the Vec allocation
   
https://github.com/apache/arrow-rs/blob/08e34ba2710abff8b6119a043ef8d5d0102b2b8e/arrow-buffer/src/buffer/mutable.rs#L813-L830
   
   Which then makes a buffer
   
   So TLDR is I don't understand why this saves a memory copy
   
   Or is the theory that all the shenanigans to make Vec -> MutableBuffer -> 
Bytes/Buffer can be reduced?



##########
arrow-buffer/src/buffer/mutable.rs:
##########
@@ -101,14 +101,25 @@ pub struct MutableBuffer {
     data: NonNull<u8>,
     // invariant: len <= capacity
     len: usize,
-    layout: Layout,
+    capacity: usize,

Review Comment:
   What is the rationale for inlining Layout here?
   
   It seems like Layout 
https://doc.rust-lang.org/beta/std/alloc/struct.Layout.html has the same 
representation and has accessors to get capacity and alignment 🤔 as usize
   
   https://doc.rust-lang.org/beta/std/alloc/struct.Layout.html#method.size
   https://doc.rust-lang.org/beta/std/alloc/struct.Layout.html#method.align
   
   If we reverted this change to Layout I think the diff would be easier to 
understand 



##########
arrow-buffer/src/buffer/mutable.rs:
##########
@@ -101,14 +101,25 @@ pub struct MutableBuffer {
     data: NonNull<u8>,
     // invariant: len <= capacity
     len: usize,
-    layout: Layout,
+    capacity: usize,
+    // cached alignment, Layout reconstructed when needed
+    align: usize,
 
     /// Memory reservation for tracking memory usage
     #[cfg(feature = "pool")]
     reservation: Mutex<Option<Box<dyn MemoryReservation>>>,
 }
 
 impl MutableBuffer {
+    /// Reconstruct Layout from cached capacity and alignment.
+    /// Only used in cold paths (alloc/dealloc/realloc).
+    #[inline]
+    fn layout(&self) -> Layout {
+        debug_assert!(self.align.is_power_of_two());

Review Comment:
   See comment above -- I don't understand the rationale for inlining the 
`Layout` 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Buffer builder hot path opts [arrow-rs]

Reply via email to