guillaume-rochette-oxb opened a new pull request, #48963: URL: https://github.com/apache/arrow/pull/48963
### Rationale for this change

Hi everyone,

This PR resolves issue #48962. It adds a function that dynamically restacks (resizes) a stream of `pa.RecordBatch` objects according to minimum and maximum limits on rows and bytes. Batches that are too large are split into smaller ones and, conversely, small batches are concatenated into bigger ones. This makes resource usage more predictable for parallel processing tasks.

### What changes are included in this PR?

The function `restack_batches()` and its unit tests 😃

### Are these changes tested?

Yes 🫡

### Are there any user-facing changes?

No 🙅
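
For readers who want a concrete picture of the restacking described in the rationale, here is a minimal sketch. It is not the PR's `restack_batches()` implementation. The name `restack_by_rows`, its parameters, and the `concat` callback are hypothetical, and the sketch enforces row limits only, not byte limits. It assumes each batch exposes `num_rows` and `slice(offset, length)`, as `pa.RecordBatch` does.

```python
from typing import Any, Callable, Iterable, Iterator, List


def restack_by_rows(
    batches: Iterable[Any],
    min_rows: int,
    max_rows: int,
    concat: Callable[[List[Any]], Any],
) -> Iterator[Any]:
    """Yield batches of at most max_rows rows, merging inputs below min_rows.

    Hypothetical helper: `concat` merges a list of batches into one batch.
    Only the final yielded batch may hold fewer than min_rows rows.
    """
    if not 0 < min_rows <= max_rows:
        raise ValueError("expected 0 < min_rows <= max_rows")

    pending: List[Any] = []  # buffered pieces, in arrival order
    pending_rows = 0         # total rows currently buffered

    def take(n: int) -> Any:
        """Remove the first n buffered rows and return them as one batch."""
        nonlocal pending_rows
        parts, need = [], n
        while need > 0:
            head = pending[0]
            if head.num_rows <= need:
                # The whole piece fits: consume it entirely.
                parts.append(pending.pop(0))
                need -= head.num_rows
            else:
                # Split the piece: emit its prefix, keep the rest buffered.
                parts.append(head.slice(0, need))
                pending[0] = head.slice(need)
                need = 0
        pending_rows -= n
        return parts[0] if len(parts) == 1 else concat(parts)

    for batch in batches:
        if batch.num_rows == 0:
            continue  # empty batches contribute nothing
        pending.append(batch)
        pending_rows += batch.num_rows
        # Oversized buffer: emit full max_rows chunks first.
        while pending_rows >= max_rows:
            yield take(max_rows)
        # What remains is below max_rows; emit it once it reaches min_rows.
        if pending_rows >= min_rows:
            yield take(pending_rows)

    # End of stream: flush the remainder, even if it is below min_rows.
    if pending_rows > 0:
        yield take(pending_rows)
```

With pyarrow, `concat` could be `pa.concat_batches` in releases that provide it. A byte limit could be layered on in the same way by also tracking `RecordBatch.nbytes`, which this sketch leaves out.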
