pepijnve commented on issue #16193:
URL: https://github.com/apache/datafusion/issues/16193#issuecomment-2913693236

   Yes, this is more or less the same issue. PR #14028 proposed adding a yield 
point at the leaf of the plan, when moving from one file to the next. This PR 
adds yield points closer to the top of the plan tree, just below the 
AggregateExec's stream: it wraps the input and yields after every 64 input 
batches. I was wondering whether that should be based on row count or a time 
interval rather than on batch count.
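
To make the batch-count idea concrete, here is a minimal, std-only sketch of a wrapper that returns `Poll::Pending` once after every `budget` items. It is not the code from either PR: `YieldEvery`, `CountingWaker`, and `drain` are hypothetical names, a plain iterator stands in for the input stream, and the self-wake uses `wake_by_ref` only because `context::defer` is, as far as I can tell, not part of Tokio's public API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper: hands out items from `inner`, but returns
/// `Poll::Pending` once after every `budget` items.
struct YieldEvery<I> {
    inner: I,
    budget: usize,
    emitted: usize,
}

impl<I: Iterator> YieldEvery<I> {
    fn new(inner: I, budget: usize) -> Self {
        assert!(budget > 0, "budget must be non-zero");
        Self { inner, budget, emitted: 0 }
    }

    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<I::Item>> {
        if self.emitted == self.budget {
            // Budget used up: reset it, wake ourselves so the task is
            // polled again, and hand control back to the caller.
            self.emitted = 0;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        let item = self.inner.next();
        if item.is_some() {
            self.emitted += 1;
        }
        Poll::Ready(item)
    }
}

/// Waker that only counts how often it was woken (illustration only).
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls the wrapper to completion; returns (items seen, yields observed).
fn drain(total: usize, budget: usize) -> (usize, usize) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut stream = YieldEvery::new(0..total, budget);
    let (mut items, mut yields) = (0, 0);
    loop {
        match stream.poll_next(&mut cx) {
            Poll::Ready(Some(_)) => items += 1,
            Poll::Ready(None) => break,
            Poll::Pending => yields += 1,
        }
    }
    // Every Pending must have been paired with exactly one wake-up.
    assert_eq!(yields, counter.0.load(Ordering::SeqCst));
    (items, yields)
}

fn main() {
    let (items, yields) = drain(200, 64);
    println!("items={items} yields={yields}");
}
```

A row-count or time-based variant would presumably keep the same shape and change only the condition that triggers the `Poll::Pending` branch, for example by accumulating rows per batch or comparing against an `Instant` taken at the last yield.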
   
   The comments on PR #14028 regarding Tokio's `yield_now` are interesting and 
relevant for PR #16196 as well. It seems the code pattern should be
   
   ```
   context::defer(cx.waker());
   return Poll::Pending;
   ```
   
   rather than
   
   ```
   cx.waker().wake_by_ref();
   return Poll::Pending;
   ```
   
   I can run some tests to see what the actual behavior is in the 
single-threaded and multi-threaded Tokio runtimes, if that helps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

