tustvold commented on PR #13423: URL: https://github.com/apache/datafusion/pull/13423#issuecomment-2480521560
I took a quick look at this and the content looks good; I couldn't check the diagrams as I'm on my phone.

> Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly

We _could_. However, to preserve the thread-per-core architecture we would need to cap the threads of the blocking pool to the core count. Then, as blocking tasks can't yield while waiting for input, every CPU-bound morsel would have to be spawned separately. This _would_ give us a "morsel-driven" scheduler; however, tokio has a relatively high per-task overhead, so even discounting the sheer amount of boilerplate this would require, the performance would be regrettable.

Ultimately, spawn_blocking is designed for blocking **IO**; it is not designed for CPU-bound tasks.
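
To make the shape of that alternative concrete, here is a minimal sketch of "cap the pool at the core count and submit every morsel as its own job". It is a std-only stand-in, not tokio's blocking pool and not DataFusion code: `core_count` and `sum_morsels` are hypothetical names, and summing a `Vec<u64>` stands in for real CPU-bound work. With tokio itself, the cap would presumably be set through `tokio::runtime::Builder::max_blocking_threads`.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical helper: number of worker threads for a thread-per-core pool.
fn core_count() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Hypothetical helper: sums each morsel on a pool capped at `workers` threads.
/// Every morsel is submitted as its own job, because a job here runs to
/// completion and cannot yield part-way to wait for more input.
fn sum_morsels(morsels: Vec<Vec<u64>>, workers: usize) -> Vec<u64> {
    let workers = workers.max(1);
    let n = morsels.len();

    // Job queue shared by all workers: (morsel index, morsel data).
    let (job_tx, job_rx) = mpsc::channel::<(usize, Vec<u64>)>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    // Results flow back tagged with the morsel index.
    let (res_tx, res_rx) = mpsc::channel::<(usize, u64)>();

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // The lock is held while dequeuing and released before computing.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((idx, morsel)) => {
                        let sum: u64 = morsel.iter().sum();
                        res_tx.send((idx, sum)).unwrap();
                    }
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    // Drop the original sender so the result channel closes once workers exit.
    drop(res_tx);

    // One submission per morsel: this is the per-task overhead in question.
    for (idx, morsel) in morsels.into_iter().enumerate() {
        job_tx.send((idx, morsel)).unwrap();
    }
    drop(job_tx);

    let mut out = vec![0u64; n];
    for (idx, sum) in res_rx {
        out[idx] = sum;
    }
    for handle in handles {
        handle.join().unwrap();
    }
    out
}
```

Calling `sum_morsels(batches, core_count())` yields one sum per input morsel, in input order. Even in this toy version, each morsel costs a queue push, a lock acquisition, and a result send; that per-job bookkeeping is the kind of overhead the comment argues would dominate if every morsel became a separate tokio task.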