EmilyMatt opened a new issue, #19679:
URL: https://github.com/apache/datafusion/issues/19679

   ### Is your feature request related to a problem or challenge?
   
   In ExternalSorter, when a batch is inserted, we are unable to return the 
memory required to sort it, and there are no in_mem_batches to spill, we return 
an error, essentially quitting the query.
   For huge batches, like the ones returned from an Aggregate stream, this can 
easily cause OOMs  despite a lot of memory being available, this severely hurts 
the resilience of the SortExec.
   
   ### Describe the solution you'd like
   
   If the batch is equal or smaller than batch_size, we should return an error.
   But otherwise, a much better solution to do would be to modify the sorted 
spilling stream to hold the original batch, and use something similar to the 
logic in sort_batch_chunked, so every time we only need to reserve the extra 
memory of a single sliced batch, while the memory of the original batch will be 
held for longer, we will never request the huge amount of memory required to 
sort the whole thing, instead only holding the original batch's memory, and 
each time an extra memory of the sliced batch we are sorting, which is dropped 
immediately on writing.
   This will require minimal changes, as we already use a stream with a 
reservation, that reservation just needs to hold the original batch's 
reservation as well, which will be release when the stream is dropped.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to