[I] Add deferred materialization for oversized coalesced takes in BatchCoalescer [arrow-rs]

via GitHub Sat, 18 Apr 2026 10:45:12 -0700


ClSlaid opened a new issue, #9760:
URL: https://github.com/apache/arrow-rs/issues/9760


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   This is a follow-up to the review on `#9758`:
   
   > \"Shouldn't it cap `indices` at limit?\"
   > https://github.com/apache/arrow-rs/pull/9758#discussion_r3104937679
   
   `BatchCoalescer::push_batch_with_indices` now chunks oversized `indices`, 
which fixes the immediate issue. However, for fallback schemas such as `Utf8`, 
the coalescer still eagerly materializes all indexed `take` chunks up front.
   
   This matters because `take` can amplify output size:
   - the input batch may have `N` rows
   - `indices.len()` may be much larger than `N`
   - indices may repeat rows many times
   
   For very large repeated takes, this fallback path is still noticeably 
expensive.
   
   **Describe the solution you'd like**
   
   Add deferred materialization for oversized indexed takes in `BatchCoalescer`.
   
   High-level idea:
   - when `push_batch_with_indices` falls back to materialized `take`
   - enqueue the pending indexed-take work instead of evaluating all chunks 
immediately
   - keep only a small ready window of completed batches
   - materialize more only as completed batches are consumed
   
   A reasonable first step may be to enable this only for fallback schemas such 
as `Utf8`, while leaving the existing direct path for primitive and view types 
unchanged.
   
   **Describe alternatives you've considered**
   
   - Keep the current eager chunked fallback
   - Restrict lazy materialization to fallback types only
   - Revisit API/error handling first, since deferred `take_record_batch(...)` 
work is awkward with `next_completed_batch()` returning `Option<RecordBatch>` 
instead of `Result<...>`
   
   **Additional context**
   
   We added an oversized-take benchmark with:
   - input batch size: `8192`
   - output indices length: `131072`
   - `biggest_coalesce_batch_size = Some(1024)`
   
   Current eager chunked behavior:
   - `primitive extra_large_repeat`: about `1.23 ms`
   - `mixed_utf8 extra_large_repeat`: about `76.6-81.2 ms`
   
   Experimental lazy prototype in a separate worktree:
   - `primitive extra_large_repeat`: about `1.32 ms`
   - `mixed_utf8 extra_large_repeat`: about `29.5 ms`
   
   These numbers are local and still imperfect, but they suggest deferred 
materialization could significantly improve oversized fallback `take`s.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[I] Add deferred materialization for oversized coalesced takes in BatchCoalescer [arrow-rs]

Reply via email to