Re: [I] [Epic] Pipeline breaking cancellation support and improvement [datafusion]

via GitHub Fri, 13 Jun 2025 04:35:16 -0700


pepijnve commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2970116925


   > This is true in theory -- but I think we also take pains to try and avoid
   > "over scheduling" tasks in tokio -- for example, we purposely only have N input
   > partitions (and hence N streams) per scan, even if there are 100+ files -- the
   > goal is to keep all the cores busy, but not oversubscribed.
   
   What I was trying to say is that, from a scheduling/yielding point of view,
you can reason about each box in isolation. Whether you actually try to make
hundreds of concurrent (not parallel) tasks or not is a rabbit hole for another
thread 😄


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


