Dandandan opened a new pull request, #20764:
URL: https://github.com/apache/datafusion/pull/20764

   ## Which issue does this PR close?
   
   - Closes #.
   
   ## Rationale for this change
   
   The current implementation sends error messages to output partitions 
sequentially in a `for` loop, awaiting each `send()` before starting the next. 
This can be inefficient when there are many output partitions, because one slow 
send delays every send after it. With `futures::future::join_all()`, the sends 
are polled concurrently, so they can all make progress at the same time.
   
   This change improves performance in error scenarios by reducing the total 
time spent notifying all output partitions of errors or completion.
   
   ## What changes are included in this PR?
   
   - Added `use futures::future::join_all;` import
   - Refactored three error/completion handling paths in 
`RepartitionExec::run_input_partition()`:
     1. Join error handling: Changed from sequential loop to concurrent sends 
using `join_all()`
     2. Input task error handling: Changed from sequential loop to concurrent 
sends using `join_all()`
     3. Successful completion: Changed from sequential loop to concurrent sends 
using `join_all()`
   - Each path now uses `txs.into_values().map()` to turn each channel sender 
into a future, and `join_all()` drives those futures concurrently
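
To illustrate the difference between awaiting sends one by one and driving them together, here is a minimal std-only sketch. It is not the code in this PR: `SlowSend`, `join_all_sketch`, `block_on`, and `run_demo` are hypothetical names, `SlowSend` merely imitates a channel send that is not ready on its first poll, and `join_all_sketch` is a simplified stand-in for `futures::future::join_all` written only so the example needs no external crates.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical stand-in for a channel `send()` that returns `Pending` on
/// its first poll and completes on the second.
struct SlowSend {
    id: usize,
    polled_once: bool,
    log: Log,
}

impl Future for SlowSend {
    type Output = usize;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<usize> {
        if self.polled_once {
            self.log.borrow_mut().push(format!("done {}", self.id));
            return Poll::Ready(self.id);
        }
        self.polled_once = true;
        self.log.borrow_mut().push(format!("start {}", self.id));
        cx.waker().wake_by_ref(); // request another poll
        Poll::Pending
    }
}

/// Simplified stand-in for `futures::future::join_all` (an assumption made
/// for this sketch): each pass polls every unfinished future, and the
/// combined future resolves once all of them have completed.
fn join_all_sketch<F>(mut futs: Vec<F>) -> impl Future<Output = Vec<F::Output>>
where
    F: Future + Unpin,
{
    let mut results: Vec<Option<F::Output>> = futs.iter().map(|_| None).collect();
    std::future::poll_fn(move |cx: &mut Context<'_>| {
        for (fut, slot) in futs.iter_mut().zip(results.iter_mut()) {
            if slot.is_none() {
                if let Poll::Ready(v) = Pin::new(fut).poll(cx) {
                    *slot = Some(v);
                }
            }
        }
        if results.iter().all(Option::is_some) {
            Poll::Ready(results.iter_mut().map(|s| s.take().unwrap()).collect::<Vec<_>>())
        } else {
            Poll::Pending
        }
    })
}

/// Waker that does nothing; the toy executor below simply re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy busy-polling executor, sufficient for this demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn make_sends(log: &Log) -> Vec<SlowSend> {
    (0..2)
        .map(|id| SlowSend { id, polled_once: false, log: Rc::clone(log) })
        .collect()
}

/// Returns (sequential event log, concurrent event log).
fn run_demo() -> (Vec<String>, Vec<String>) {
    // Sequential: each send is awaited to completion before the next starts.
    let seq_log: Log = Rc::new(RefCell::new(Vec::new()));
    let sends = make_sends(&seq_log);
    block_on(async move {
        for send in sends {
            send.await;
        }
    });

    // Concurrent: every send is started before any of them has finished.
    let conc_log: Log = Rc::new(RefCell::new(Vec::new()));
    block_on(join_all_sketch(make_sends(&conc_log)));

    let seq = seq_log.borrow().clone();
    let conc = conc_log.borrow().clone();
    (seq, conc)
}
```

Calling `run_demo()` yields the sequential order `start 0, done 0, start 1, done 1` and the concurrent order `start 0, start 1, done 0, done 1`. In the concurrent case all sends run on the same task and are interleaved rather than run in parallel on separate threads.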
   
   ## Are these changes tested?
   
   The changes are covered by existing tests in the DataFusion test suite. The 
refactoring maintains the same functional behavior (all error messages and 
completion signals are still sent to all output partitions), only changing the 
execution model from sequential to concurrent.
   
   ## Are there any user-facing changes?
   
   No user-facing changes. This is an internal optimization to the physical 
execution layer that improves performance without changing the external API or 
behavior.
   
   https://claude.ai/code/session_01GDTBavJzih6tSSBd9SRNmk


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
