pepijnve commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2963507764

   @alamb our messages crossed paths in the ether.
   
   > I have somewhat lost track of what exactly you are proposing.
   
   Sorry, I usually do this around a white board. First and foremost I'm trying 
to communicate what I see as intrinsic problems with the external optimizer 
rule approach. This was countered with a belief that it can be solved, but I 
don't see it yet myself. Happy to be proven wrong of course.
   I've explained the problem to 3 or 4 colleagues in the meantime as unbiased 
as I possibly could and they basically all tell me the same.
   
   >Is it the approach in one of these PRs:
   > https://github.com/apache/datafusion/pull/16301
   > https://github.com/apache/datafusion/pull/16319
   
   A bit of both :)
   
   The rationale behind #16319 is twofold.
   - By using a JoinHandle you naturally solve the cancellation problem. A 
JoinHandle is pending once, and wakes when the result is ready. The caller is 
unblocked immediately and can `select!` on a timeout. The subtle effect of it 
is that you shift from a cancellation problem for the caller to an abort 
problem for the spawned task. Aborting an AbortHandle still requires 
cooperation from the task in order for it to actually stop.
   - My hunch was that by splitting deep call stacks into multiple chained 
shallow call stacks the cost of yielding to the runtime could be reduced. That 
was just a first principles reasoning thing that I wanted to try out. Still 
working on measuring if it has a performance benefit.
   
   #16301 is the variant of yielding where we would punt on the idea of 
implementing yielding via an optimizer rule or some other automatic mechanism. 
Instead you state that every `Stream` is responsible for its own behavior 
(which the guidelines kind of already do). And then we fix the base library 
implementations.
   I think this fixes cancellation for everyone except downstream users with 
custom pipeline breaking operators. For those people I would propose providing 
utilities in the library to make doing the right thing as low-effort as 
possible and provide best practice examples. Something akin to "If your code 
has looping patterns call 'consume_budget' every so often" or "wrap your input 
stream in a YieldStream".
   #16301 does not make use of Tokio's coop budget yet however. Instead it's 
the per-operator counter variant. The diff per operator does illustrate the 
scope and nature of the change `Stream` implementers would need to make.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to