pepijnve commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2970025341

   > I am curious what's the budget count since we can't config it from 
datafusion, will it affect performance or other things? It seems not, because 
we already use RecordBatchReceiverStream for the budget?
   
   @zhuqi-lucas that's correct, it can't be configured at the moment. The 
same is already true for `RecordBatchReceiverStream` today. Tokio hardcodes 
the magic number `128` (see 
https://github.com/tokio-rs/tokio/blob/master/tokio/src/task/coop/mod.rs#L116).
   
   > If we have to share the one budget for all leaf nodes, will some leaf node 
very aggressive consuming budget will affect the total fairness or performance?
   
   The budget is per spawned task. Every time the Tokio scheduler lets a task 
run, it gives the task a budget of 128, which the task can then deplete until 
it hits zero. At that point the task is coaxed towards yielding by making all 
budget-aware Tokio resources return `Pending`.
   From the perspective of DataFusion code I don't think this changes much. 
It's exactly the behavior you already get today when the source streams are 
`RecordBatchReceiverStream`s, so as soon as a plan contains a 
repartition/coalesce you get this with the current code.
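
To make the per-task budget concrete, here is a toy model of the mechanism described above. It is a std-only sketch, not Tokio's implementation: `BudgetedSource`, `run_one_turn` and `simulate` are hypothetical names, the budget is passed around explicitly instead of living in task-local state, and charging one unit per `poll_next` call (including the final end-of-stream poll) is a modeling assumption.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Budget handed to a task each time it is scheduled (the `128` mentioned above).
const INITIAL_BUDGET: u32 = 128;

/// Hypothetical stand-in for a budget-aware resource such as a channel receiver.
struct BudgetedSource {
    items: VecDeque<u64>,
}

impl BudgetedSource {
    /// Returns `Pending` once the budget is exhausted, even if items remain.
    fn poll_next(&mut self, budget: &mut u32) -> Poll<Option<u64>> {
        if *budget == 0 {
            return Poll::Pending;
        }
        *budget -= 1;
        Poll::Ready(self.items.pop_front())
    }
}

/// One scheduler turn: the task gets a fresh budget and runs until the source
/// reports `Pending` (forced yield) or end of stream.
/// Returns (items consumed this turn, whether the stream finished).
fn run_one_turn(source: &mut BudgetedSource) -> (usize, bool) {
    let mut budget = INITIAL_BUDGET;
    let mut consumed = 0;
    loop {
        match source.poll_next(&mut budget) {
            Poll::Ready(Some(_)) => consumed += 1,
            Poll::Ready(None) => return (consumed, true),
            Poll::Pending => return (consumed, false),
        }
    }
}

/// Drains a source of `total` items and records how many were consumed per turn.
fn simulate(total: u64) -> Vec<usize> {
    let mut source = BudgetedSource { items: (0..total).collect() };
    let mut per_turn = Vec::new();
    loop {
        let (consumed, done) = run_one_turn(&mut source);
        per_turn.push(consumed);
        if done {
            return per_turn;
        }
    }
}

fn main() {
    // 300 items are drained in three turns: 128, 128, then the remaining 44.
    println!("{:?}", simulate(300));
}
```

In this model a single aggressive consumer cannot run for more than 128 budget-aware operations before it is forced to yield, and because each task gets its own fresh budget on every turn, one task depleting its budget does not take anything away from other tasks.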


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

