Re: [PR] [Variant] Optimize the object header generation logic in ObjectBuilder::finish [arrow-rs]

via GitHub Fri, 01 Aug 2025 07:04:28 -0700


alamb commented on code in PR #8031:
URL: https://github.com/apache/arrow-rs/pull/8031#discussion_r2248075157



##########
parquet-variant/src/builder.rs:
##########
@@ -64,19 +64,55 @@ fn write_offset(buf: &mut Vec<u8>, value: usize, nbytes: 
u8) {
     buf.extend_from_slice(&bytes[..nbytes as usize]);
 }
 
-/// Write little-endian integer to buffer at a specific position
-fn write_offset_at_pos(buf: &mut [u8], start_pos: usize, value: usize, nbytes: 
u8) {
-    let bytes = value.to_le_bytes();
-    buf[start_pos..start_pos + nbytes as 
usize].copy_from_slice(&bytes[..nbytes as usize]);
-}
-
 /// Append `value_size` bytes of given `value` into `dest`.
 fn append_packed_u32(dest: &mut Vec<u8>, value: u32, value_size: usize) {
     let n = dest.len() + value_size;
     dest.extend(value.to_le_bytes());
     dest.truncate(n);
 }
 
+/// An iterator that yields the bytes of a packed u32 iterator.
+/// Will yield the first `packed_bytes` bytes of each item in the iterator.
+struct PackedU32Iterator<T: Iterator<Item = [u8; 4]>> {
+    packed_bytes: usize,

Review Comment:
   > However -- the temp buf append+truncate is fundamentally faster on a 
per-entry basis, because of how much simpler it is at machine code level. And 
it will run at exactly the same speed for all packing sizes, because there are 
no branches or even conditional moves based on the packing size (**). The only 
downside is the up-front cost of allocating said temp buffer first, which has 
to amortize across all entries.
   
   Yes I agree with this analysis. 
   
   This is why my opinion still remains that the way to avoid the allocation is 
to calculate the header size and shift the bytes first so we write directly to 
the target. I realize that is trickier code, but as @klion26 has said somewhere 
else I think we could write some good assertions and tests to avoid problems
   
   When I next get a chance to work on Variant related things with time, I want 
to focus on unsticking the shredding implementation. Once that is done, I may 
return here to try and restructure things
   
   One thought I had was to encapsulate the logic / templating in a 
   
   ```rust
   struct ListHeaderWriter {
     offsets: &[usize],
   }
   
   impl ListHeaderWriter {
     fn header_size::<const OFFSET_SIZE:usize>(&self) -> usize {
       // put the header size calculation here
     }
     //  write the header into dst starting at start_offset
     fn write(dst: &mut Vec<u8>, start_offset: usize)::<const 
OFFSET_SIZE:usize>(&self) {
      ... 
     }
   }
   ```
   
   
     



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] [Variant] Optimize the object header generation logic in ObjectBuilder::finish [arrow-rs]

Reply via email to