andygrove opened a new pull request, #3661:
URL: https://github.com/apache/datafusion-comet/pull/3661

   ## Summary
   
   - In `append_struct_fields_field_major`, the first pass now collects nested 
struct addresses and sizes alongside the null bitmap
   - The per-field second pass uses these pre-collected addresses via 
`point_to()` instead of re-reading from parent row pointer arrays 
(`read_row_at!`) and calling `get_struct()` for every field of every row
   - The same optimization is applied to the Binary, Utf8, Decimal128, nested 
Struct, and List/Map field cases
   
   ## Rationale
   
   Previously, for a struct with F fields and N rows, the code performed N*F 
pointer dereferences into the parent row address/size arrays plus N*F 
`get_struct()` calls (each involving `get_offset_and_len`, which reads an i64 
and does bit manipulation). After this change, parent row reads and 
`get_struct()` calls happen only N times in total, all in the first pass, and 
the second pass uses cheap `point_to()` calls with the cached addresses.
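
The N*F versus N difference can be illustrated with a minimal, self-contained sketch. It does not use Comet's actual types: `ParentRows`, `field_major_naive`, `field_major_cached`, and the `lookups` counter are hypothetical stand-ins, and the `get_struct` below only simulates the cost of a parent row read plus struct resolution by counting calls. Caching a slice reference plays the role that the cached address/size pair and `point_to()` play in the real code.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the parent rows: each row holds an optional
/// nested struct, modelled here as a vector of i64 field values.
struct ParentRows {
    rows: Vec<Option<Vec<i64>>>,
    /// Counts simulated "parent row read + get_struct" operations.
    lookups: Cell<usize>,
}

impl ParentRows {
    fn new(rows: Vec<Option<Vec<i64>>>) -> Self {
        ParentRows { rows, lookups: Cell::new(0) }
    }

    /// Simulates the expensive path: locate the parent row, then resolve
    /// the nested struct inside it. `None` means the struct is null.
    fn get_struct(&self, row: usize) -> Option<&[i64]> {
        self.lookups.set(self.lookups.get() + 1);
        self.rows[row].as_deref()
    }
}

/// Field-major conversion that resolves the nested struct again for every
/// field of every row: N * F lookups.
fn field_major_naive(parent: &ParentRows, num_fields: usize) -> Vec<Vec<Option<i64>>> {
    (0..num_fields)
        .map(|f| {
            (0..parent.rows.len())
                .map(|r| parent.get_struct(r).map(|s| s[f]))
                .collect()
        })
        .collect()
}

/// Field-major conversion with a first pass that resolves each nested
/// struct once (N lookups) and a second pass that reuses the cached view.
fn field_major_cached(parent: &ParentRows, num_fields: usize) -> Vec<Vec<Option<i64>>> {
    // First pass: one lookup per row; `None` doubles as the null marker.
    let cached: Vec<Option<&[i64]>> =
        (0..parent.rows.len()).map(|r| parent.get_struct(r)).collect();

    // Second pass: per-field loop over the cached views, no further lookups.
    (0..num_fields)
        .map(|f| cached.iter().map(|&s| s.map(|v| v[f])).collect())
        .collect()
}
```

Under these assumptions both functions produce the same columns, while the counter on `ParentRows` grows by N*F for the naive version and by N for the cached one.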
   
   ## Test plan
   
   - [x] `cargo clippy --all-targets --workspace -- -D warnings` passes
   - [ ] Existing struct row-to-columnar tests cover these code paths


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

