pepijnve commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2964042212
Some more information to share and time to eat some humble pie for me. Google led me to a [withoutboats post](https://internals.rust-lang.org/t/runtime-agnostic-cooperative-task-scheduling-budget/18796) which led me to a [Tokio blog post](https://tokio.rs/blog/2020-04-preemption) which led to an aha moment. It’s worth reading the entire post but the key quote was

> As long as the task has budget remaining, the resource operates as it did previously. Each asynchronous operation (actions that users must .await on) decrements the task's budget. Once the task is out of budget, all Tokio resources will perpetually return "not ready" until the task yields back to the scheduler. At that point, the budget is reset, and future .awaits on Tokio resources will again function normally.

So this is the “consume at the source” idea we have now, but with a task-wide latch per tick rather than per resource. And the latch only resets when you actually yield to the runtime. This removes all the edge cases as long as we ensure all sources are using the same task budget. As we’ve discussed above the channel receiver is already doing that for us. For some reason file IO was not. I’m not sure I understand why that’s the case and will try to figure out why tomorrow.

Perhaps we can have the “it works automagically” cake and eat it after all.
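To make the "task-wide latch" idea concrete, here is a minimal toy model using only the standard library. It is a sketch of the behaviour described in the quote, not Tokio's actual implementation: `TaskBudget`, `poll_consume`, `reset_on_yield`, `Source` and `run` are all hypothetical names invented for illustration, and the "yield" is simulated by a plain function call rather than a real scheduler.

```rust
use std::cell::Cell;
use std::task::Poll;

/// Toy stand-in for a per-task cooperative budget (hypothetical, not a Tokio type).
struct TaskBudget {
    remaining: Cell<u32>,
    initial: u32,
}

impl TaskBudget {
    fn new(initial: u32) -> Self {
        Self { remaining: Cell::new(initial), initial }
    }

    /// Every source calls this before producing an item.
    /// Once the shared budget is spent, it keeps returning `Pending`.
    fn poll_consume(&self) -> Poll<()> {
        let left = self.remaining.get();
        if left == 0 {
            return Poll::Pending;
        }
        self.remaining.set(left - 1);
        Poll::Ready(())
    }

    /// Models the task yielding to the runtime: the only place the budget resets.
    fn reset_on_yield(&self) {
        self.remaining.set(self.initial);
    }
}

/// A source that always has data and is gated only by the shared budget.
struct Source<'a> {
    budget: &'a TaskBudget,
    next: u32,
}

impl<'a> Source<'a> {
    fn poll_next(&mut self) -> Poll<u32> {
        match self.budget.poll_consume() {
            Poll::Pending => Poll::Pending,
            Poll::Ready(()) => {
                let value = self.next;
                self.next += 1;
                Poll::Ready(value)
            }
        }
    }
}

/// Polls two sources alternately until `total` items are produced and
/// returns how many times the simulated task had to yield.
fn run(budget_per_tick: u32, total: u32) -> u32 {
    let budget = TaskBudget::new(budget_per_tick);
    let mut a = Source { budget: &budget, next: 0 };
    let mut b = Source { budget: &budget, next: 0 };
    let mut produced = 0;
    let mut yields = 0;
    while produced < total {
        let src = if produced % 2 == 0 { &mut a } else { &mut b };
        match src.poll_next() {
            Poll::Ready(_) => produced += 1,
            Poll::Pending => {
                // Out of budget: "yield" and let the budget reset.
                yields += 1;
                budget.reset_on_yield();
            }
        }
    }
    yields
}

fn main() {
    println!("yields: {}", run(4, 10));
}
```

Because both sources draw from the same `TaskBudget`, it does not matter which one is polled: the task is forced to yield after a fixed number of items per tick. A source that skipped `poll_consume` would bypass the latch entirely, which is the kind of gap suspected for file IO above.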