pepijnve commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2963459881
@zhuqi-lucas it would be useful to have some additional voices in this discussion. I can share my opinion, but it's only one opinion. I feel like I'm just going to keep repeating the same critique over and over again otherwise. In short, I'm not convinced this is a physical plan optimization problem. It was a good idea, but I don't think it can be refined into something sufficiently precise. By trying to make a generic/universal solution you end up with something unnecessarily complicated. I see this as a `Stream` implementation problem. The place to fix this is in the Rust code of the stream implementations because the decision to consume budget or not is tightly coupled to the implementation logic. You're trying to count how many times a Future or Stream was able to make progress towards its goal. A YieldStream is always going to be a crude approximation of that. Implementations may opt to use YieldStream as a convenience of course. You could try to describe the need for a YieldStream around a child declaratively, but the ability to switch streams dynamically is at odds with a one-shot approach like an optimizer rule. Additionally, but this is arguably just aesthetic, I don't think you ever want to see YieldExec showing up in explain plans. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org