andygrove opened a new pull request, #3661: URL: https://github.com/apache/datafusion-comet/pull/3661
## Summary

- In `append_struct_fields_field_major`, the first pass now collects nested struct addresses and sizes alongside the null bitmap
- The per-field second pass uses these pre-collected addresses via `point_to()` instead of re-reading from parent row pointer arrays (`read_row_at!`) and calling `get_struct()` for every field of every row
- Same optimization applied to the Binary, Utf8, Decimal128, nested Struct, and List/Map field cases

## Rationale

Previously, for a struct with F fields and N rows, the code performed N*F pointer dereferences into the parent row address/size arrays plus N*F `get_struct()` calls (each involving `get_offset_and_len`, which reads an i64 and does bit manipulation). After this change, parent row reads and `get_struct` calls happen only N times total in the first pass, and the second pass uses cheap `point_to()` calls with the cached addresses.

## Test plan

- [x] `cargo clippy --all-targets --workspace -- -D warnings` passes
- [ ] Existing struct row-to-columnar tests cover these code paths
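The N*F versus N argument in the rationale can be illustrated with a small, self-contained sketch. This is not the Comet implementation: `make_row`, `get_struct`, `read_field`, `per_field_decode`, and `cached_decode` are hypothetical stand-ins, the packed `(offset << 32) | size` layout is an assumption chosen to resemble the kind of i64 decode the description attributes to `get_offset_and_len`, and null handling is omitted. A counter tracks how often the parent row is decoded under each strategy.

```rust
/// Width in bytes of each fixed-size i64 field in the toy nested struct.
const FIELD_WIDTH: usize = 8;

/// Hypothetical parent row: 8 bytes packing (offset << 32) | size,
/// followed by the nested struct's i64 fields. Assumed layout, for illustration only.
fn make_row(values: &[i64]) -> Vec<u8> {
    let offset = 8i64;
    let size = (values.len() * FIELD_WIDTH) as i64;
    let mut bytes = ((offset << 32) | size).to_le_bytes().to_vec();
    for v in values {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    bytes
}

/// Reads a little-endian i64 starting at `start`.
fn read_i64(bytes: &[u8], start: usize) -> i64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[start..start + 8]);
    i64::from_le_bytes(buf)
}

/// Stand-in for the expensive step: decode the packed i64 and locate the
/// nested struct. `decodes` counts how many times this runs.
fn get_struct<'a>(row: &'a [u8], decodes: &mut usize) -> &'a [u8] {
    *decodes += 1;
    let packed = read_i64(row, 0);
    let offset = (packed >> 32) as usize;
    let size = (packed & 0xFFFF_FFFF) as usize;
    &row[offset..offset + size]
}

/// Stand-in for the cheap step: read one field from an already located struct.
fn read_field(nested: &[u8], field: usize) -> i64 {
    read_i64(nested, field * FIELD_WIDTH)
}

/// Before: every field of every row re-decodes the parent row (N*F decodes).
fn per_field_decode(rows: &[Vec<u8>], num_fields: usize, decodes: &mut usize) -> Vec<Vec<i64>> {
    let mut columns = Vec::with_capacity(num_fields);
    for field in 0..num_fields {
        let mut column = Vec::with_capacity(rows.len());
        for row in rows {
            let nested = get_struct(row, decodes);
            column.push(read_field(nested, field));
        }
        columns.push(column);
    }
    columns
}

/// After: the first pass locates each nested struct once (N decodes);
/// the per-field pass only uses the cached locations.
fn cached_decode(rows: &[Vec<u8>], num_fields: usize, decodes: &mut usize) -> Vec<Vec<i64>> {
    let mut nested_structs: Vec<&[u8]> = Vec::with_capacity(rows.len());
    for row in rows {
        nested_structs.push(get_struct(row, decodes));
    }
    let mut columns = Vec::with_capacity(num_fields);
    for field in 0..num_fields {
        let mut column = Vec::with_capacity(rows.len());
        for nested in &nested_structs {
            column.push(read_field(nested, field));
        }
        columns.push(column);
    }
    columns
}
```

Calling both functions on the same rows (for example, two rows built with `make_row(&[1, 2, 3])` and `make_row(&[4, 5, 6])`, with `num_fields = 3`) should produce identical columns, while the counter ends at N*F for `per_field_decode` and at N for `cached_decode`.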
