pepijnve commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2969854461

   > fixes any bugs / adds features over the current one,
   > Is "just" cleaner way to implement the same thing (this is also a fine 
thing to contribute as well).
   
   There are a couple of benefits.
   
   It removes the edge case seen in the interleave operator (or any `select!` 
style code in general). With the current per-stream counter, one stream might 
want to yield, but the parent stream may respond by polling another stream 
that happens to be ready. The end result is that two cooperating streams may 
turn into a non-cooperating stream when they are merged. To fix this, you 
would need to adjust the merging operator as well, and we're basically back 
where we started.
   If all cooperating streams use the same budget, then this problem goes away. 
Once the yield point has been hit, all cooperating streams will yield.
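
   To make the difference concrete, here is a rough, std-only sketch. It is a 
toy model, not DataFusion or Tokio code: `Poll`, `Source`, `PerStreamCounter`, 
`SharedBudget` and `poll_merged` are all hypothetical names, and the inputs are 
assumed to be always ready. It counts how many items a `select!`-style merge 
emits before it first returns `Pending`.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Simplified stand-in for `Poll<Option<RecordBatch>>` (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Poll {
    Ready,
    Pending,
}

/// Hypothetical always-ready stream.
trait Source {
    fn poll_next(&mut self) -> Poll;
}

/// Yields after `limit` items, tracked by a counter private to this stream.
struct PerStreamCounter {
    remaining: u32,
    limit: u32,
}

impl Source for PerStreamCounter {
    fn poll_next(&mut self) -> Poll {
        if self.remaining == 0 {
            self.remaining = self.limit; // reset, then ask to yield
            return Poll::Pending;
        }
        self.remaining -= 1;
        Poll::Ready
    }
}

/// Yields once a budget shared by every stream in the task is used up.
struct SharedBudget {
    budget: Rc<Cell<u32>>,
}

impl Source for SharedBudget {
    fn poll_next(&mut self) -> Poll {
        let left = self.budget.get();
        if left == 0 {
            return Poll::Pending;
        }
        self.budget.set(left - 1);
        Poll::Ready
    }
}

/// `select!`-style merge: if `a` wants to yield, try `b` instead.
fn poll_merged(a: &mut dyn Source, b: &mut dyn Source) -> Poll {
    match a.poll_next() {
        Poll::Pending => b.poll_next(),
        ready => ready,
    }
}

/// Number of items the merged stream emits before its first `Pending`.
fn items_before_first_yield(a: &mut dyn Source, b: &mut dyn Source) -> u32 {
    let mut n = 0;
    while poll_merged(a, b) == Poll::Ready {
        n += 1;
    }
    n
}

fn per_stream_demo(limit: u32) -> u32 {
    let mut a = PerStreamCounter { remaining: limit, limit };
    let mut b = PerStreamCounter { remaining: limit, limit };
    items_before_first_yield(&mut a, &mut b)
}

fn shared_budget_demo(limit: u32) -> u32 {
    let budget = Rc::new(Cell::new(limit));
    let mut a = SharedBudget { budget: Rc::clone(&budget) };
    let mut b = SharedBudget { budget };
    items_before_first_yield(&mut a, &mut b)
}

fn main() {
    println!("per-stream counters: {}", per_stream_demo(2));
    println!("shared budget:       {}", shared_budget_demo(2));
}
```

   With a limit of 2, the per-stream version emits 8 items before the merged 
stream yields, while the shared-budget version yields after 2. In this toy 
model the merged stream still yields eventually, just far later than either 
input intended; a real merge operator may behave differently.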
   
   Using the task budget also avoids the 'redundant yield' problem in the 
current version. If you now run a simple `SELECT * FROM ...` query, by default 
you'll get a `Pending` after every 64 `Ready(RecordBatch)`. With the task 
budget, a `Pending` is only injected when it's actually necessary. The system 
automatically does the right thing.
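
   A small simulation of the 'redundant yield' point, under loud assumptions: 
`LIMIT`, `injected_by_counter` and `injected_by_budget` are hypothetical, the 
budget size of 64 is chosen only to match the counter, and "the consumer 
returns to the scheduler on its own every `per_wakeup` batches" is a stand-in 
for whatever makes the task yield naturally. It is a sketch of the idea, not 
of how Tokio tracks its budget.

```rust
/// Batches between forced yields (64 matches the default mentioned above).
const LIMIT: u32 = 64;

/// Per-stream counter: it only sees its own `Ready` results, so it injects a
/// `Pending` every `LIMIT` batches no matter what the rest of the task does.
fn injected_by_counter(total: u32) -> u32 {
    let (mut run, mut injected) = (0, 0);
    for _ in 0..total {
        run += 1;
        if run == LIMIT {
            run = 0;
            injected += 1;
        }
    }
    injected
}

/// Task-wide budget: refilled whenever the task returns to the scheduler.
/// `per_wakeup` is the number of batches the consumer handles before it
/// yields on its own (use `u32::MAX` for "never").
fn injected_by_budget(total: u32, per_wakeup: u32) -> u32 {
    let (mut budget, mut since_wakeup, mut injected) = (LIMIT, 0, 0);
    for _ in 0..total {
        budget -= 1;
        since_wakeup += 1;
        if since_wakeup == per_wakeup {
            // Natural yield: the scheduler gets control, budget is refilled.
            budget = LIMIT;
            since_wakeup = 0;
        } else if budget == 0 {
            // Budget exhausted without a natural yield: force one.
            injected += 1;
            budget = LIMIT;
            since_wakeup = 0;
        }
    }
    injected
}

fn main() {
    println!("counter:               {}", injected_by_counter(640));
    println!("budget, yields anyway: {}", injected_by_budget(640, 10));
    println!("budget, never yields:  {}", injected_by_budget(640, u32::MAX));
}
```

   In this model the counter injects 10 `Pending`s over 640 batches regardless 
of the consumer, whereas the budget injects none when the task already yields 
every 10 batches and 10 when it never yields by itself.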
   
   Finally, it aligns the cooperative yielding strategy across the library. 
`RecordBatchReceiverStream` is implicitly already using this strategy in a way 
you cannot opt out of. It's better to have one consistent way of solving this 
cancellation problem once and for all.
   
   I have a patch almost ready. I'll open a draft PR now so this all 
becomes a bit more tangible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

