DerGut commented on code in PR #15692: URL: https://github.com/apache/datafusion/pull/15692#discussion_r2041234887
########## datafusion/physical-plan/src/sorts/sort.rs: ########## @@ -765,6 +765,25 @@ impl ExternalSorter { Ok(()) } + + /// Reserves memory to be able to accommodate the given batch. + /// If memory is scarce, tries to spill current in-memory batches to disk first. + async fn reserve_memory_for_batch(&mut self, input: &RecordBatch) -> Result<()> { + let size = get_reserved_byte_for_record_batch(input); + + let result = self.reservation.try_grow(size); + if result.is_err() { + if self.in_mem_batches.is_empty() { + return result; Review Comment: This is great! Thanks a lot for the reference 🙏 I added some context with https://github.com/apache/datafusion/pull/15692/commits/d4e654c176b4eb0ce084029d181f91735d83f375 I dropped the detail about buffered batches because I figured that it's probably not relevant to the user. Regardless of whether in-memory batches could be spilled or not, the actionable info is to increase the memory limit. This also allowed me to reuse the same context in https://github.com/apache/datafusion/pull/15692/commits/a14e585c490372167814714dfcf6cf20b3d1726b (let me know if you don't think this is necessary). I also added a suggestion to decrease `sort_spill_reservation_bytes` instead. But I'm not convinced this is good advice to the user. I can leave it as a comment instead 🤷 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org