crepererum commented on code in PR #6310:
URL: https://github.com/apache/arrow-datafusion/pull/6310#discussion_r1189155285


##########
datafusion/core/src/physical_plan/repartition/mod.rs:
##########
@@ -532,9 +541,28 @@ impl RepartitionExec {
                 timer.done();
             }
 
-            // If the input stream is endless, we may spin forever and never yield back to tokio. Hence let us yield.
-            // See https://github.com/apache/arrow-datafusion/issues/5278.
-            tokio::task::yield_now().await;
+            // If the input stream is endless, we may spin forever and

Review Comment:
   You can call it a bug or a design issue in DF / tokio, but if you run two 
spawned tasks and one never yields back to tokio, the other will never run. 
Unbounded buffers are NOT avoidable in the current DF design, because you 
cannot predict tokio scheduling or hash outputs. So the fix here is adequate. 
`consume_budget` would be the better solution, but it is an unstable tokio 
feature, so it is not usable here. 
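
To illustrate the starvation argument, here is a minimal sketch using only the standard library. It is a toy model, not tokio's scheduler and not the DataFusion code: `YieldNow`, `run_all`, `worker`, and `schedule` are hypothetical names made up for this example, and `YieldNow` is only a stand-in for what `tokio::task::yield_now()` does conceptually (return `Pending` once so the executor can poll something else).

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Toy stand-in for a yield point: `Pending` on the first poll, `Ready` on the second.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        // A real executor would reschedule on wake; the toy one below simply re-queues.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Waker that does nothing, because `run_all` re-queues every pending task itself.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;
type Log = Arc<Mutex<Vec<&'static str>>>;

/// Hypothetical single-threaded round-robin executor (assumption: one worker thread).
fn run_all(tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut queue: VecDeque<Task> = tasks.into();
    while let Some(mut task) = queue.pop_front() {
        // Another task only gets a turn once this poll returns.
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }
}

/// Records `name` once per step; yields between steps only if `cooperative`.
async fn worker(name: &'static str, steps: usize, cooperative: bool, log: Log) {
    for _ in 0..steps {
        log.lock().unwrap().push(name);
        if cooperative {
            YieldNow(false).await;
        }
    }
}

/// Runs two workers and returns the order in which their steps executed.
fn schedule(cooperative: bool) -> Vec<&'static str> {
    let log: Log = Arc::new(Mutex::new(Vec::new()));
    let mut tasks: Vec<Task> = Vec::new();
    tasks.push(Box::pin(worker("a", 3, cooperative, log.clone())));
    tasks.push(Box::pin(worker("b", 3, cooperative, log.clone())));
    run_all(tasks);
    let order = log.lock().unwrap().clone();
    order
}

fn main() {
    println!("without yield: {:?}", schedule(false));
    println!("with yield:    {:?}", schedule(true));
}
```

Without the yield point, task `a` finishes all of its steps before `b` is polled at all, so with an endless input `b` would never run. With the yield point, the two tasks interleave.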



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
