Rachelint opened a new pull request, #23269: URL: https://github.com/apache/datafusion/pull/23269
## Which issue does this PR close?

- Closes #.

## Rationale for this change

The current producer-side repartition coalescer is shared by all input tasks for each output partition. That adds synchronization around every coalesced batch path when multiple input tasks target the same output partition.

## What changes are included in this PR?

This PR replaces the shared per-output-partition coalescer with local per-producer-channel coalescers in `RepartitionExec`:

- each non-preserve-order output channel owns its own `LimitedBatchCoalescer`
- preserve-order mode still skips producer-side coalescing and relies on `StreamingMergeBuilder`
- local coalescers are finalized by their owning input task at end of input
- the shared `Arc<Mutex<LimitedBatchCoalescer>>` and active-sender tracking are removed

## Are these changes tested?

Ran:

- `cargo fmt --all`
- `cargo check -p datafusion-physical-plan`
- `cargo clippy -p datafusion-physical-plan --all-targets --all-features -- -D warnings`
- `cargo clippy --all-targets --all-features -- -D warnings`

Existing repartition tests cover the coalescing and spilling paths.

## Are there any user-facing changes?

No user-facing API changes.
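As a rough illustration of the idea behind the change (not the actual `RepartitionExec` code), the sketch below gives each producer thread its own coalescer per output channel, so no `Arc<Mutex<...>>` is needed on the batch path, and each producer flushes its own coalescers at end of input. Everything here is a hypothetical stand-in: `Batch` is a plain `Vec<u32>`, `LocalCoalescer` is a toy replacement for `LimitedBatchCoalescer`, rows are routed by `value % num_outputs`, and `std::sync::mpsc` channels stand in for the real output channels.

```rust
use std::sync::mpsc;
use std::thread;

/// Toy stand-in for a record batch: just a vector of row values.
type Batch = Vec<u32>;

/// Hypothetical, simplified coalescer owned by exactly one producer channel.
/// It is NOT DataFusion's `LimitedBatchCoalescer`; it only buffers rows.
struct LocalCoalescer {
    target_rows: usize,
    buffer: Batch,
}

impl LocalCoalescer {
    fn new(target_rows: usize) -> Self {
        assert!(target_rows > 0, "target_rows must be positive");
        Self { target_rows, buffer: Vec::new() }
    }

    /// Buffer `rows` and return every completed batch of exactly `target_rows` rows.
    fn push(&mut self, rows: Batch) -> Vec<Batch> {
        self.buffer.extend(rows);
        let mut ready = Vec::new();
        while self.buffer.len() >= self.target_rows {
            let rest = self.buffer.split_off(self.target_rows);
            ready.push(std::mem::replace(&mut self.buffer, rest));
        }
        ready
    }

    /// Flush whatever is left; called by the owning producer at end of input.
    fn finish(&mut self) -> Option<Batch> {
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}

/// Run one producer thread per input. Each producer owns one coalescer per
/// output channel, so the coalescing path takes no lock.
fn repartition(
    inputs: Vec<Vec<Batch>>,
    num_outputs: usize,
    target_rows: usize,
) -> Vec<Vec<Batch>> {
    assert!(num_outputs > 0, "need at least one output partition");
    let (senders, receivers): (Vec<_>, Vec<_>) =
        (0..num_outputs).map(|_| mpsc::channel::<Batch>()).unzip();

    let producers: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let senders = senders.clone();
            thread::spawn(move || {
                // Local state: one coalescer per (this producer, output) pair.
                let mut coalescers: Vec<LocalCoalescer> = (0..senders.len())
                    .map(|_| LocalCoalescer::new(target_rows))
                    .collect();
                for batch in input {
                    // Assumed routing rule for the sketch: value % num_outputs.
                    let mut parts = vec![Vec::new(); senders.len()];
                    for row in batch {
                        parts[row as usize % senders.len()].push(row);
                    }
                    for (p, rows) in parts.into_iter().enumerate() {
                        for full in coalescers[p].push(rows) {
                            senders[p].send(full).unwrap();
                        }
                    }
                }
                // End of input: the owning producer finalizes its own coalescers.
                for (p, coalescer) in coalescers.iter_mut().enumerate() {
                    if let Some(rest) = coalescer.finish() {
                        senders[p].send(rest).unwrap();
                    }
                }
            })
        })
        .collect();

    // Drop the original senders so receivers see end-of-stream once producers exit.
    drop(senders);
    for producer in producers {
        producer.join().unwrap();
    }
    receivers.into_iter().map(|rx| rx.iter().collect()).collect()
}
```

One trade-off visible in this sketch: because every producer flushes its own remainder, an output partition can receive up to one undersized tail batch per producer rather than one overall. Whether that matters for the real operator depends on details this sketch does not model.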
