Rachelint commented on PR #11943:
URL: https://github.com/apache/datafusion/pull/11943#issuecomment-2288031573

   @2010YOUY01 After checking the code for memory control, I think I get it.
   - `emit_early_if_necessary` is used in `Partial`
   - and `spill_previous_if_necessary` is used in the final phases
   
   They both serve spilling, and the logic seems to be like this:
   - After reaching the memory limit, force the `Partial` to submit batches to `Final` as soon as possible
   - The `Final` then spills them to disk to avoid OOM
   - After all batches are submitted to `Final`, the `Final` merges the spilled batches and the in-memory batches to get the final results.
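
To check my understanding, here is a toy sketch of that flow. It is not DataFusion's actual code: `PartialAgg`, `FinalAgg`, `aggregate` and the `max_groups` limit are hypothetical names, a group count stands in for the real memory limit, a `Vec` stands in for spill files on disk, and the merge is a plain hash merge rather than whatever the real implementation does.

```rust
use std::collections::HashMap;

/// Toy batch of aggregate state: group key -> running sum.
type Batch = HashMap<String, i64>;

/// Hypothetical stand-in for the `Partial` phase.
struct PartialAgg {
    groups: Batch,
    max_groups: usize, // crude proxy for the memory limit (assumption)
}

impl PartialAgg {
    fn new(max_groups: usize) -> Self {
        PartialAgg { groups: Batch::new(), max_groups }
    }

    /// Accumulate one row. Once the limit is reached, hand the current
    /// state downstream early (the role I attribute to `emit_early_if_necessary`).
    fn push(&mut self, key: &str, value: i64) -> Option<Batch> {
        *self.groups.entry(key.to_string()).or_insert(0) += value;
        if self.groups.len() >= self.max_groups {
            Some(std::mem::take(&mut self.groups))
        } else {
            None
        }
    }

    /// Flush whatever is left at end of input.
    fn finish(&mut self) -> Batch {
        std::mem::take(&mut self.groups)
    }
}

/// Hypothetical stand-in for the `Final` phase.
struct FinalAgg {
    groups: Batch,
    max_groups: usize,
    spilled: Vec<Batch>, // stands in for spill files on disk (assumption)
}

impl FinalAgg {
    fn new(max_groups: usize) -> Self {
        FinalAgg { groups: Batch::new(), max_groups, spilled: Vec::new() }
    }

    /// Before taking a new batch, spill the in-memory state if the batch
    /// would push us over the limit (the role I attribute to
    /// `spill_previous_if_necessary`).
    fn push(&mut self, batch: Batch) {
        if !self.groups.is_empty() && self.groups.len() + batch.len() > self.max_groups {
            self.spilled.push(std::mem::take(&mut self.groups));
        }
        for (key, value) in batch {
            *self.groups.entry(key).or_insert(0) += value;
        }
    }

    /// Merge the spilled batches with the in-memory batch into the final result.
    fn finish(self) -> Batch {
        let FinalAgg { mut groups, spilled, .. } = self;
        for run in spilled {
            for (key, value) in run {
                *groups.entry(key).or_insert(0) += value;
            }
        }
        groups
    }
}

/// Run both phases over `rows`; returns the final sums and the spill count.
fn aggregate(rows: &[(&str, i64)], partial_limit: usize, final_limit: usize) -> (Batch, usize) {
    let mut partial = PartialAgg::new(partial_limit);
    let mut fin = FinalAgg::new(final_limit);
    for &(key, value) in rows {
        if let Some(batch) = partial.push(key, value) {
            fin.push(batch);
        }
    }
    fin.push(partial.finish());
    let spills = fin.spilled.len();
    (fin.finish(), spills)
}
```

With tight limits on both sides, calling `aggregate` over a handful of rows makes the partial side emit early several times and the final side spill between batches, yet the merged sums are the same as a single in-memory aggregation would give.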
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


