DerGut commented on code in PR #15692:
URL: https://github.com/apache/datafusion/pull/15692#discussion_r2041234887


##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -765,6 +765,25 @@ impl ExternalSorter {
 
         Ok(())
     }
+
+    /// Reserves memory to be able to accommodate the given batch.
+    /// If memory is scarce, tries to spill current in-memory batches to disk 
first.
+    async fn reserve_memory_for_batch(&mut self, input: &RecordBatch) -> 
Result<()> {
+        let size = get_reserved_byte_for_record_batch(input);
+
+        let result = self.reservation.try_grow(size);
+        if result.is_err() {
+            if self.in_mem_batches.is_empty() {
+                return result;

Review Comment:
   This is great! Thanks a lot for the reference 🙏
   
   I added some context with 
https://github.com/apache/datafusion/pull/15692/commits/d4e654c176b4eb0ce084029d181f91735d83f375
   
   I dropped the detail about buffered batches because I figured that it's 
probably not relevant to the user. Regardless of whether in-memory batches 
could be spilled or not, the actionable info is to increase the memory limit. 
This also allowed me to reuse the same context in 
https://github.com/apache/datafusion/pull/15692/commits/a14e585c490372167814714dfcf6cf20b3d1726b
 (let me know if you don't think this is necessary).
   
   I also added a suggestion to decrease `sort_spill_reservation_bytes` 
instead. But I'm not convinced this is good advice to the user. I can leave it 
as a comment instead 🤷 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to