pepijnve commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2970116925
> This is true in theory -- but I think we also take pains to try and avoid "over scheduling" tasks in tokio -- for example, we purposely only have N input partitions (and hence N streams) per scan, even if there are 100+ files -- the goal is to keep all the cores busy, but not oversubscribed.

What I was trying to say is that from a scheduling/yielding pov you can reason about each box in isolation. Whether you actually try to make 100s of concurrent (not parallel) tasks or not is a rabbit hole for another thread 😄