Re: [I] [Epic] Pipeline breaking cancellation support and improvement [datafusion]

via GitHub Fri, 13 Jun 2025 03:07:24 -0700


alamb commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2969824040


   > As we’ve discussed above the channel receiver is already doing that for 
us. For some reason file IO was not. I’m not sure I understand why that’s the 
case and will try to figure out why tomorrow.
   
   This is consistent with our observations at InfluxData: we saw 
uncancellable queries when feeding our plan from an in-memory cache (not a file 
/ memory)
   
   > 
https://github.com/pepijnve/datafusion/blob/cancel_spec/dev/design/cancellation.md
   
   This is a really nice writeup: it matches my understanding / mental model. 
It would also make a great start to a post for the DataFusion blog, FWIW, so 
I filed a ticket to track that idea 🎣:
   - https://github.com/apache/datafusion/issues/16396
   
   > The more I read the more I can see DataFusion is basically a modern day 
Volcano.
   
   I think this is an accurate assessment, though I would probably phrase it as 
"DataFusion uses Volcano-style parallelism where operators are single-threaded 
and Exchange (`RepartitionExec`) operators handle parallelism". The other 
prevalent style is called "Morsel Driven Parallelism", popularized by DuckDB and 
TUM/Umbra [in this paper](https://db.in.tum.de/~leis/papers/morsels.pdf), which 
uses operators that are explicitly multi-threaded.
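
   To make the Volcano-style split concrete, here is a minimal sketch using only the Rust standard library. It is not DataFusion code: `sum_partition` and `exchange_and_sum` are hypothetical names, and OS threads plus channels stand in for Tokio tasks and `RepartitionExec`. Each operator stays single-threaded, and all parallelism comes from the exchange step.

```rust
use std::sync::mpsc;
use std::thread;

/// A single-threaded "operator" (hypothetical): sums the rows of one partition.
fn sum_partition(rows: mpsc::Receiver<u64>) -> u64 {
    rows.iter().sum()
}

/// A toy exchange (hypothetical): distributes input rows round-robin over `n`
/// output partitions, each consumed by its own single-threaded operator.
fn exchange_and_sum(input: Vec<u64>, n: usize) -> Vec<u64> {
    assert!(n > 0, "need at least one output partition");
    let mut senders = Vec::with_capacity(n);
    let mut workers = Vec::with_capacity(n);
    for _ in 0..n {
        let (tx, rx) = mpsc::channel();
        senders.push(tx);
        workers.push(thread::spawn(move || sum_partition(rx)));
    }
    for (i, row) in input.into_iter().enumerate() {
        senders[i % n].send(row).expect("worker hung up");
    }
    // Dropping the senders closes the channels so every worker can finish.
    drop(senders);
    workers
        .into_iter()
        .map(|w| w.join().expect("worker panicked"))
        .collect()
}
```

   Calling `exchange_and_sum((1..=10).collect(), 2)` returns one partial sum per partition; a final single-threaded operator would then combine them.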
   
   > Each of the colored blocks is an independently executing sub portion of 
the query. Translated to Tokio each of these colored blocks is a separate 
concurrent task. Each of those tasks needs to be cooperatively scheduled to 
guarantee all of them get a fair share of time to run.
   
   This is true in theory -- but I think we also take pains to try and avoid 
"over scheduling" tasks in tokio -- for example, we purposely only have `N` 
input partitions (and hence N streams) per scan, even if there are 100+ files 
-- the goal is to keep all the cores busy, but not oversubscribed.
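
   As a rough illustration of capping the number of streams, the sketch below groups many files into at most `target_partitions` groups. The function name `group_files` and the round-robin assignment are assumptions made for this example; DataFusion's actual file grouping logic may differ.

```rust
/// Distribute `files` round-robin over at most `target_partitions` groups
/// (hypothetical helper; each group would back one scan stream).
fn group_files(files: Vec<String>, target_partitions: usize) -> Vec<Vec<String>> {
    assert!(target_partitions > 0, "need at least one partition");
    // Never create more groups than there are files, but always at least one.
    let n = target_partitions.min(files.len()).max(1);
    let mut groups = vec![Vec::new(); n];
    for (i, file) in files.into_iter().enumerate() {
        groups[i % n].push(file);
    }
    groups
}
```

   With 100 files and 8 target partitions this yields 8 groups of 12 or 13 files each, so the scan exposes 8 streams rather than 100.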
   
   > So what does all this mean in terms of implementation:
   
   This also sounds fine to me, and I would be happy to review PRs, etc. However, 
it is not 100% clear whether your proposed design 
   1. fixes any bugs / adds features over the current one, or 
   2. is "just" a cleaner way to implement the same thing (this is also a fine 
thing to contribute). 
   
   For example, I wonder if there are additional tests / cases that would be 
improved with the proposed implementation 🤔 
   
   > The one thing that we still cannot solve automatically then is dynamic 
query planning. Operators that create streams dynamically still have to make 
sure they set things up correctly themselves.
   
   In my opinion this is fine -- if operators are making dynamic streams, that 
is an advanced use case that today must still handle canceling / yielding. I 
think it is ok if we can't find a way to automatically provide yielding 
behavior to them (they are no worse off than today)
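
   For illustration, this is roughly what "handling yielding yourself" means, as a toy model in plain Rust. `Step`, `BudgetedCounter`, and `drive` are hypothetical names, and the budget counter here is a simplification, not Tokio's actual mechanism. In a real async operator, the `Yield` case would roughly correspond to handing control back to the runtime (for example via `tokio::task::yield_now()`), which is the point where a cancelled query can stop.

```rust
/// Outcome of polling the toy operator once (hypothetical type).
#[derive(Debug, PartialEq)]
enum Step {
    Item(u64),
    Yield,
    Done,
}

/// A toy operator that produces 0..end and yields after `budget` items.
struct BudgetedCounter {
    next: u64,
    end: u64,
    budget: u32,
    remaining: u32,
}

impl BudgetedCounter {
    fn new(end: u64, budget: u32) -> Self {
        assert!(budget > 0, "budget must be positive");
        Self { next: 0, end, budget, remaining: budget }
    }

    fn poll(&mut self) -> Step {
        if self.next >= self.end {
            return Step::Done;
        }
        if self.remaining == 0 {
            // Budget exhausted: reset it and give the scheduler a turn.
            self.remaining = self.budget;
            return Step::Yield;
        }
        self.remaining -= 1;
        let item = self.next;
        self.next += 1;
        Step::Item(item)
    }
}

/// Drives the operator to completion, counting items and yield points.
fn drive(mut op: BudgetedCounter) -> (u64, u32) {
    let (mut items, mut yields) = (0u64, 0u32);
    loop {
        match op.poll() {
            Step::Item(_) => items += 1,
            // A scheduler could run other tasks or observe cancellation here.
            Step::Yield => yields += 1,
            Step::Done => return (items, yields),
        }
    }
}
```

   An operator without the budget check would return items back to back and never give the scheduler an opening, which is the uncancellable behavior discussed above.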
   
   > One possible downside to this approach is that the cooperative scheduling 
budget is implementation specific to the Tokio runtime. DataFusion becomes more 
tied to Tokio rather than less. Not sure if that's an issue or not.
   
   I personally don't think this is an issue as I don't see any movement and 
have not heard any desire to move away from tokio.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


