alamb commented on code in PR #6932:
URL: https://github.com/apache/arrow-datafusion/pull/6932#discussion_r1267842060
##########
datafusion/execution/src/memory_pool/mod.rs:
##########
@@ -221,6 +219,49 @@ pub fn human_readable_size(size: usize) -> String {
format!("{value:.1} {unit}")
}
+/// Tracks the change in memory to avoid overflow. Typically, this
+/// is used like the following
+///
+/// 1. Call `delta.dec(sized_thing.size())`
+///
+/// 2. potentially change size of `sized_thing`
+///
+/// 3. Call `delta.inc(sized_thing.size())`
+#[derive(Debug, Default)]
+pub struct MemoryDelta {
Review Comment:
However, now that you mention it, the delta accounting is now done
per-`Accumulator` in the adapter, so the actual group by hash operator can
probably drop its own delta accounting.
Here is my proposal:
1. I will add some comments on the rationale for delta accounting to this PR
2. I will merge this PR
3. Either you remove the delta accounting from the main GroupByHash operator in
https://github.com/apache/arrow-datafusion/pull/7016, or I will do so, and we
can run some benchmarks to make sure it doesn't have a performance impact
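
For reference, the dec / change / inc pattern from the doc comment can be sketched roughly as below. This is only an illustration under assumptions: the two fields, the `apply` method and the `tracked_size` helper are hypothetical names and are not taken from the PR.

```rust
/// Hypothetical sketch of a delta tracker. The fields and method bodies
/// are assumptions for illustration, not the PR's implementation.
#[derive(Debug, Default)]
pub struct MemoryDelta {
    /// Total bytes reported via `inc`
    increase: usize,
    /// Total bytes reported via `dec`
    decrease: usize,
}

impl MemoryDelta {
    /// Step 1: record the size of the value before it changes.
    pub fn dec(&mut self, size: usize) {
        self.decrease += size;
    }

    /// Step 3: record the size of the value after it changed.
    pub fn inc(&mut self, size: usize) {
        self.increase += size;
    }

    /// Apply the net change to an existing byte count. Keeping the two
    /// totals as separate unsigned counters means a shrinking value never
    /// causes an unsigned subtraction to wrap around.
    pub fn apply(&self, reserved: usize) -> usize {
        if self.increase >= self.decrease {
            reserved + (self.increase - self.decrease)
        } else {
            reserved.saturating_sub(self.decrease - self.increase)
        }
    }
}

/// Hypothetical size function: bytes occupied by the elements of a slice.
pub fn tracked_size(values: &[u64]) -> usize {
    values.len() * std::mem::size_of::<u64>()
}

/// Walks through the three steps for a vector that shrinks from 4 to 1
/// elements, starting from an assumed reservation of 100 bytes.
pub fn shrink_example() -> usize {
    let mut delta = MemoryDelta::default();
    let mut values = vec![0u64; 4];

    delta.dec(tracked_size(&values)); // 1. size before the change
    values.truncate(1); //               2. the value changes size
    delta.inc(tracked_size(&values)); // 3. size after the change

    delta.apply(100)
}
```

If the accounting only lives in the per-`Accumulator` adapter, the operator itself would not need to repeat these three steps.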