pepijnve commented on code in PR #19287:
URL: https://github.com/apache/datafusion/pull/19287#discussion_r2625116079


##########
datafusion/physical-plan/src/aggregates/row_hash.rs:
##########
@@ -550,26 +569,37 @@ impl GroupedHashAggregateStream {
             .collect::<Vec<_>>()
             .join(", ");
         let name = format!("GroupedHashAggregateStream[{partition}] 
({agg_fn_names})");
-        let reservation = MemoryConsumer::new(name)
-            .with_can_spill(true)
-            .register(context.memory_pool());
         let group_ordering = GroupOrdering::try_new(&agg.input_order_mode)?;
+        let oom_mode = match group_ordering {
+            GroupOrdering::None => {
+                if agg.mode == AggregateMode::Partial {
+                    OutOfMemoryMode::EmitEarly
+                } else {
+                    OutOfMemoryMode::Spill
+                }
+            }
+            GroupOrdering::Partial(_) | GroupOrdering::Full(_) => 
OutOfMemoryMode::Spill,

Review Comment:
   You're right that for full ordering it doesn't make much sense to spill to 
disk. The only possible situation where spilling would happen is indeed a 
single group with a larger than memory reservation allows aggregate value. If 
you spill to disk, you'll most likely encounter the exact same OOM problem when 
reading back the spilled data.
   I guess in that case we should just report the OOM error and abort instead 
(unless we're doing partial aggregation; in that case forced early emit is 
possible too).
   
   Note that the OOM behaviour only kicks in after first trying to emit groups 
that are known to be complete based on the GroupOrdering.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to