[PR] perf: use aligned slice access in SparkUnsafeArray bulk append [datafusion-comet]

via GitHub Tue, 10 Mar 2026 17:12:39 -0700


andygrove opened a new pull request, #3659:
URL: https://github.com/apache/datafusion-comet/pull/3659


   ## Summary
   
   - Replace per-element `read_unaligned()` + manual pointer arithmetic with 
slice-based indexed access in all nullable bulk append paths 
(`impl_append_to_builder` macro, `append_booleans`, `append_timestamps`, 
`append_dates`)
   - Remove runtime alignment checks in non-nullable paths, since alignment is 
guaranteed by the SparkUnsafeArray layout, and always use the `append_slice` bulk copy
   - Add `debug_assert` to verify the alignment invariant
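
To make the nullable-path change concrete, here is a minimal before/after sketch. It is not the code from this PR: `is_null_at`, `append_per_element`, and `append_via_slice` are hypothetical names, a `Vec<Option<i32>>` stands in for the Arrow builder, and the null bitset is assumed to mark a null element with a set bit.

```rust
use std::mem::align_of;

/// Hypothetical null check: bit `i` of the bitset is 1 when element `i` is null.
fn is_null_at(null_words: &[u64], i: usize) -> bool {
    (null_words[i / 64] >> (i % 64)) & 1 == 1
}

/// Old style: one `read_unaligned()` per element plus a manual pointer bump.
///
/// # Safety
/// `ptr` must point to `n` readable `i32` values.
pub unsafe fn append_per_element(
    ptr: *const i32,
    n: usize,
    null_words: &[u64],
    out: &mut Vec<Option<i32>>,
) {
    let mut p = ptr;
    for i in 0..n {
        if is_null_at(null_words, i) {
            out.push(None);
        } else {
            out.push(Some(unsafe { p.read_unaligned() }));
        }
        // Advance to the next element (at most one past the end).
        p = unsafe { p.add(1) };
    }
}

/// New style: build one slice over the element region and index into it.
///
/// # Safety
/// `ptr` must be non-null, aligned for `i32`, and point to `n` readable values.
pub unsafe fn append_via_slice(
    ptr: *const i32,
    n: usize,
    null_words: &[u64],
    out: &mut Vec<Option<i32>>,
) {
    // Debug-only check of the alignment invariant the slice relies on.
    debug_assert!(
        (ptr as usize) % align_of::<i32>() == 0,
        "element data must be naturally aligned"
    );
    let values: &[i32] = unsafe { std::slice::from_raw_parts(ptr, n) };
    for (i, &v) in values.iter().enumerate() {
        out.push(if is_null_at(null_words, i) { None } else { Some(v) });
    }
}
```

Both functions produce the same output for aligned input; the slice version simply hands the compiler a length-bounded view instead of a raw pointer it has to track by hand.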
   
   ## Rationale
   
   SparkUnsafeArray guarantees natural alignment for element data: the header 
is `8 + ceil(n/64)*8` bytes (always 8-byte aligned), and elements are at 
`element_size` stride from the aligned base. The nullable paths were previously 
doing per-element `ptr.read_unaligned()` with `ptr = ptr.add(1)`, while the 
non-nullable paths had runtime alignment checks with fallback to unaligned 
reads. Since alignment is guaranteed, all paths can use slice access, which 
gives better codegen: there is no manual pointer arithmetic, and the compiler 
can reason about bounds.
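
The alignment argument above can be checked with a little arithmetic. The sketch below uses hypothetical helper names (`header_size_bytes`, `element_offset`) and assumes the array's base address is itself 8-byte aligned and that `element_size` is 1, 2, 4, or 8 bytes.

```rust
/// Header size in bytes: 8 bytes for the element count, followed by one
/// 8-byte null-bitset word per 64 elements, i.e. `8 + ceil(n/64)*8`.
pub fn header_size_bytes(num_elements: usize) -> usize {
    8 + ((num_elements + 63) / 64) * 8
}

/// Byte offset of element `index` from the start of the array.
pub fn element_offset(num_elements: usize, element_size: usize, index: usize) -> usize {
    header_size_bytes(num_elements) + index * element_size
}

/// True when every element offset is a multiple of `element_size`, so each
/// element is naturally aligned as long as the base address is 8-byte aligned.
pub fn all_elements_aligned(num_elements: usize, element_size: usize) -> bool {
    (0..num_elements).all(|i| element_offset(num_elements, element_size, i) % element_size == 0)
}
```

For example, with 65 four-byte elements the header is `8 + 2*8 = 24` bytes, so element 3 sits at offset `24 + 3*4 = 36`, which is a multiple of 4.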
   
   ## Test plan
   
   - [x] `cargo clippy --all-targets --workspace -- -D warnings` passes
   - [ ] Existing row-to-columnar and array element append tests/benchmarks 
cover these code paths


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

