EmilyMatt opened a new issue, #20313:
URL: https://github.com/apache/datafusion/issues/20313

   ### Describe the bug
   
   In row_hash.rs, the spill() function performs a potentially massive sort 
whose memory is never accounted for.
   We never verify that the memory needed for this sort is available. For 
something like a 1GB batch, which is very much a possibility since we are 
spilling precisely because we have run out of memory, usage will peak at 
about 2GB.
   This can easily spiral out of control and end with the operating system 
killing the process.
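
The arithmetic behind the 2GB figure can be written down as a rough model. This is only a sketch: it assumes the sorted copy is about as large as the input batch, and the function names `eager_sort_peak` and `overshoot` are hypothetical, not part of DataFusion.

```rust
/// Rough peak-memory model for sorting a batch out of place.
/// Assumption: the sorted copy is about as large as the input batch.
fn eager_sort_peak(batch_bytes: u64) -> u64 {
    // The input and the sorted output are both alive until the input is dropped.
    batch_bytes.saturating_mul(2)
}

/// Bytes by which that peak exceeds the pool limit (0 if the sort fits).
fn overshoot(pool_limit: u64, batch_bytes: u64) -> u64 {
    eager_sort_peak(batch_bytes).saturating_sub(pool_limit)
}
```

Under this model, a 1GB batch sorted against a 1GB pool overshoots the limit by roughly another 1GB, none of which the pool knows about.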
   
   ### To Reproduce
   
   Create a single Complete/Final AggregateExec.
   Provide a 1GB memory pool.
   Provide input data whose aggregate expressions will output > 1GB.
   A spill is triggered and the data is sorted using sort_batch, causing a 
peak of > 2GB while both the input and output batches are alive.
   
   ### Expected behavior
   
   Ideally, check whether we can reserve memory for the sort and, if so, 
perform it.
   If not, fall back to a lazy sorting stream that takes only batch_size rows 
at a time; the original batch is held in memory longer, but the peak is lower.
   
   For now: return an error if the memory is unavailable.
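
A minimal, std-only sketch of the behavior described above. Everything in it is a hypothetical stand-in: `ToyPool`, `SpillSort`, `choose_spill_sort`, and `lazy_sorted_chunks` are illustrative names, not DataFusion's memory pool or spill code, and a plain `i64` slice plays the role of the batch.

```rust
/// Toy stand-in for a memory pool (hypothetical, not DataFusion's API).
struct ToyPool {
    limit: usize,
    used: usize,
}

impl ToyPool {
    /// Reserve `bytes`, or return an error if the pool limit would be exceeded.
    fn try_reserve(&mut self, bytes: usize) -> Result<(), String> {
        if self.used.saturating_add(bytes) > self.limit {
            return Err(format!(
                "cannot reserve {bytes} bytes: {} of {} already in use",
                self.used, self.limit
            ));
        }
        self.used += bytes;
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
enum SpillSort {
    /// The sorted copy fits: sort the whole batch at once.
    Eager,
    /// The sorted copy does not fit: emit sorted output chunk by chunk.
    Lazy,
}

/// Pick a strategy by first trying to reserve room for the sorted copy.
fn choose_spill_sort(pool: &mut ToyPool, batch_bytes: usize) -> SpillSort {
    match pool.try_reserve(batch_bytes) {
        Ok(()) => SpillSort::Eager,
        Err(_) => SpillSort::Lazy,
    }
}

/// Lazy variant: sort only the row indices, then gather `batch_size` rows
/// at a time. The input stays alive for the whole iteration, but only one
/// output chunk (plus the index vector) exists alongside it.
fn lazy_sorted_chunks(rows: &[i64], batch_size: usize) -> impl Iterator<Item = Vec<i64>> + '_ {
    let mut idx: Vec<usize> = (0..rows.len()).collect();
    idx.sort_by_key(|&i| rows[i]);
    let step = batch_size.max(1); // guard against a zero chunk size
    (0..idx.len()).step_by(step).map(move |start| {
        let end = (start + step).min(idx.len());
        idx[start..end].iter().map(|&i| rows[i]).collect()
    })
}
```

The interim "error if unavailable" behavior corresponds to propagating the `Err` from `try_reserve` instead of falling back to `SpillSort::Lazy`. Note that the index vector in the lazy path is itself an allocation proportional to the row count, so a real implementation would presumably need to account for it as well.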
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

