Rachelint commented on PR #11943: URL: https://github.com/apache/datafusion/pull/11943#issuecomment-2288031573
@2010YOUY01 After checking the memory-control code, I think I understand it now.

- `emit_early_if_necessary` is used in the `Partial` phase
- `spill_previous_if_necessary` is used in the final phases

Both serve spilling. The logic seems to be:

- After reaching the memory limit, force `Partial` to submit batches to `Final` as soon as possible.
- `Final` then spills them to disk to avoid OOM.
- After all batches have been submitted to `Final`, `Final` merges the spilled batches with the in-memory batches to get the final results.
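To make the flow above concrete, here is a toy sketch of it in plain Rust. It is not DataFusion's implementation: `PartialAgg`, `FinalAgg`, `Batch`, and `count_with_spilling` are hypothetical names, the "memory limit" is assumed to be a simple cap on the number of in-memory groups, and the "disk" is just a `Vec` of spilled runs.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a batch of partially aggregated (key, count) rows.
type Batch = Vec<(String, u64)>;

/// Toy `Partial` phase: emits its state early once it holds too many groups.
struct PartialAgg {
    max_groups: usize, // assumed memory budget, measured in groups
    groups: BTreeMap<String, u64>,
}

impl PartialAgg {
    fn new(max_groups: usize) -> Self {
        Self { max_groups, groups: BTreeMap::new() }
    }

    /// Count one key; return a batch if the budget forced an early emit.
    fn push(&mut self, key: &str) -> Option<Batch> {
        *self.groups.entry(key.to_string()).or_insert(0) += 1;
        if self.groups.len() > self.max_groups {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Hand over all accumulated state and start from empty.
    fn flush(&mut self) -> Batch {
        std::mem::take(&mut self.groups).into_iter().collect()
    }
}

/// Toy `Final` phase: spills its state when over budget, merges at the end.
struct FinalAgg {
    max_groups: usize,
    groups: BTreeMap<String, u64>,
    spilled: Vec<Batch>, // stand-in for spill files on disk
}

impl FinalAgg {
    fn new(max_groups: usize) -> Self {
        Self { max_groups, groups: BTreeMap::new(), spilled: Vec::new() }
    }

    /// Merge a batch from `Partial`; spill the in-memory state if over budget.
    fn push(&mut self, batch: Batch) {
        for (key, count) in batch {
            *self.groups.entry(key).or_insert(0) += count;
        }
        if self.groups.len() > self.max_groups {
            let run: Batch = std::mem::take(&mut self.groups).into_iter().collect();
            self.spilled.push(run);
        }
    }

    /// Merge spilled runs with the in-memory state. A real engine would
    /// stream this merge; the toy version simply re-aggregates into a map.
    fn finish(self) -> Batch {
        let mut merged = self.groups;
        for run in self.spilled {
            for (key, count) in run {
                *merged.entry(key).or_insert(0) += count;
            }
        }
        merged.into_iter().collect()
    }
}

/// Drive both phases; returns the final counts and the number of spills.
fn count_with_spilling(keys: &[&str], partial_budget: usize, final_budget: usize) -> (Batch, usize) {
    let mut partial = PartialAgg::new(partial_budget);
    let mut fin = FinalAgg::new(final_budget);
    for key in keys {
        if let Some(batch) = partial.push(key) {
            fin.push(batch);
        }
    }
    fin.push(partial.flush()); // end of input: submit whatever is left
    let spills = fin.spilled.len();
    (fin.finish(), spills)
}
```

Calling `count_with_spilling(&["a", "b", "a", "c", "b", "a", "d"], 2, 2)` exercises both the early emit and the spill path, while a large budget produces the same counts with no spills.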