DerGut commented on code in PR #15692: URL: https://github.com/apache/datafusion/pull/15692#discussion_r2041234887
########## datafusion/physical-plan/src/sorts/sort.rs: ########## @@ -765,6 +765,25 @@ impl ExternalSorter { Ok(()) } + + /// Reserves memory to be able to accommodate the given batch. + /// If memory is scarce, tries to spill current in-memory batches to disk first. + async fn reserve_memory_for_batch(&mut self, input: &RecordBatch) -> Result<()> { + let size = get_reserved_byte_for_record_batch(input); + + let result = self.reservation.try_grow(size); + if result.is_err() { + if self.in_mem_batches.is_empty() { + return result; Review Comment: This is great! Thanks a lot for the reference 🙏 I added some context with https://github.com/apache/datafusion/pull/15692/commits/d4e654c176b4eb0ce084029d181f91735d83f375 I dropped the detail about buffered batches because I figured that it's probably not relevant to the user. Regardless of whether in-memory batches could be spilled or not, the actionable info is to increase the memory limit. This also allowed me to reuse the same context in https://github.com/apache/datafusion/pull/15692/commits/a14e585c490372167814714dfcf6cf20b3d1726b (and https://github.com/apache/datafusion/pull/15692/commits/0ee1e8e7543fce169086f78d3ef3186c87bdf810 for completeness). This means all failing memory reservations in the external sorter are wrapped with context. Let me know if you don't think these are necessary. I also added a suggestion to decrease `sort_spill_reservation_bytes` instead. But I'm not convinced this is good advice to the user. I can leave it as a comment instead 🤷 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org