ClSlaid opened a new issue, #9760: URL: https://github.com/apache/arrow-rs/issues/9760
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.** This is a follow-up to the review on `#9758`: > \"Shouldn't it cap `indices` at limit?\" > https://github.com/apache/arrow-rs/pull/9758#discussion_r3104937679 `BatchCoalescer::push_batch_with_indices` now chunks oversized `indices`, which fixes the immediate issue. However, for fallback schemas such as `Utf8`, the coalescer still eagerly materializes all indexed `take` chunks up front. This matters because `take` can amplify output size: - the input batch may have `N` rows - `indices.len()` may be much larger than `N` - indices may repeat rows many times For very large repeated takes, this fallback path is still noticeably expensive. **Describe the solution you'd like** Add deferred materialization for oversized indexed takes in `BatchCoalescer`. High-level idea: - when `push_batch_with_indices` falls back to materialized `take` - enqueue the pending indexed-take work instead of evaluating all chunks immediately - keep only a small ready window of completed batches - materialize more only as completed batches are consumed A reasonable first step may be to enable this only for fallback schemas such as `Utf8`, while leaving the existing direct path for primitive and view types unchanged. **Describe alternatives you've considered** - Keep the current eager chunked fallback - Restrict lazy materialization to fallback types only - Revisit API/error handling first, since deferred `take_record_batch(...)` work is awkward with `next_completed_batch()` returning `Option<RecordBatch>` instead of `Result<...>` **Additional context** We added an oversized-take benchmark with: - input batch size: `8192` - output indices length: `131072` - `biggest_coalesce_batch_size = Some(1024)` Current eager chunked behavior: - `primitive extra_large_repeat`: about `1.23 ms` - `mixed_utf8 extra_large_repeat`: about `76.6-81.2 ms` Experimental lazy prototype in a separate worktree: - `primitive extra_large_repeat`: about `1.32 ms` - `mixed_utf8 extra_large_repeat`: about `29.5 ms` These numbers are local and still imperfect, but they suggest deferred materialization could significantly improve oversized fallback `take`s. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
