pepijnve commented on issue #16193: URL: https://github.com/apache/datafusion/issues/16193#issuecomment-2913693236
Yes, this is more or less the same issue. PR #14028 proposed adding a yield point at the leaf of the plan when moving from one file to the next. This PR adds yield points closer to the top of the plan tree, just below the AggregateExec's stream, by wrapping its input, and then yields every 64 input batches. I was wondering whether that should be based on row count or a time interval rather than on batch count.

The comments on PR #14028 regarding Tokio's `yield_now` are interesting and relevant for PR #16196 as well. It seems the code pattern should be

```
context::defer(cx.waker());
return Poll::Pending;
```

rather than

```
cx.waker().wake_by_ref();
return Poll::Pending;
```

I can run some tests to see what the actual behavior is in the single-threaded (ST) and multi-threaded (MT) Tokio runtimes if that helps.
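For reference, here is a minimal sketch of the "yield every N batches" idea using only the standard library. It is not the code from either PR: `YieldingSum`, `CountingWaker`, and `run_counting_yields` are hypothetical names, the "batches" are just integers, and the executor is a busy loop that re-polls immediately. It uses the self-waking `wake_by_ref` form, so it says nothing about how Tokio schedules the task after the wake.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for an operator consuming input batches:
/// sums 0..total and yields after every `yield_every` "batches".
struct YieldingSum {
    next: u64,
    total: u64,
    sum: u64,
    since_yield: u64,
    yield_every: u64,
}

impl YieldingSum {
    fn new(total: u64, yield_every: u64) -> Self {
        assert!(yield_every > 0, "yield_every must be positive");
        Self { next: 0, total, sum: 0, since_yield: 0, yield_every }
    }
}

impl Future for YieldingSum {
    type Output = u64;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        // All fields are plain integers, so the type is Unpin.
        let this = self.get_mut();
        while this.next < this.total {
            if this.since_yield == this.yield_every {
                // Budget used up: ask to be polled again, then hand
                // control back to the executor.
                this.since_yield = 0;
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
            this.sum += this.next; // "process" one batch
            this.next += 1;
            this.since_yield += 1;
        }
        Poll::Ready(this.sum)
    }
}

/// Waker that only counts how often it was woken.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Toy executor: polls in a loop until the future is ready and returns
/// the output together with the number of wake-ups it observed.
fn run_counting_yields<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, counter.wakes.load(Ordering::SeqCst));
        }
    }
}
```

With 200 batches and a budget of 64, `run_counting_yields(YieldingSum::new(200, 64))` returns control to the executor three times before completing. Replacing the batch counter with a row counter or an elapsed-time check would only change the condition guarding the `Poll::Pending` branch.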