pepijnve commented on issue #16353:
URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2963459881

   @zhuqi-lucas it would be useful to have some additional voices in this 
discussion. I can share my opinion, but it's only one opinion. I feel like I'm 
just going to keep repeating the same critique over and over again otherwise.
   
   In short, I'm not convinced this is a physical plan optimization problem. It 
was a good idea, but I don't think it can be refined into something 
sufficiently precise. By trying to make a generic/universal solution you end up 
with something unnecessarily complicated.
   
   I see this as a `Stream` implementation problem. The place to fix this is in 
the Rust code of the stream implementations because the decision to consume 
budget or not is tightly coupled to the implementation logic. You're trying to 
count how many times a Future or Stream was able to make progress towards its 
goal. A YieldStream is always going to be a crude approximation of that. 
Implementations may opt to use YieldStream as a convenience of course.
   
   You could try to describe the need for a YieldStream around a child 
declaratively, but the ability to switch streams dynamically is at odds with a 
one-shot approach like an optimizer rule. Additionally, but this is arguably 
just aesthetic, I don't think you ever want to see YieldExec showing up in 
explain plans.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to