Re: [PR] Implement disk spilling for all grouping ordering modes in GroupedHashAggregateStream [datafusion]

via GitHub Tue, 16 Dec 2025 14:43:06 -0800


rluvaton commented on code in PR #19287:
URL: https://github.com/apache/datafusion/pull/19287#discussion_r2624991342



##########
datafusion/physical-plan/src/aggregates/row_hash.rs:
##########
@@ -550,26 +569,37 @@ impl GroupedHashAggregateStream {
             .collect::<Vec<_>>()
             .join(", ");
         let name = format!("GroupedHashAggregateStream[{partition}] 
({agg_fn_names})");
-        let reservation = MemoryConsumer::new(name)
-            .with_can_spill(true)
-            .register(context.memory_pool());
         let group_ordering = GroupOrdering::try_new(&agg.input_order_mode)?;
+        let oom_mode = match group_ordering {
+            GroupOrdering::None => {
+                if agg.mode == AggregateMode::Partial {
+                    OutOfMemoryMode::EmitEarly
+                } else {
+                    OutOfMemoryMode::Spill
+                }
+            }
+            GroupOrdering::Partial(_) | GroupOrdering::Full(_) => 
OutOfMemoryMode::Spill,

Review Comment:
   I don't think you should support the regular spill in group ordering full.
   
   as there are 2 cases where you need spill:
   1. too many groups
   2. aggregate expression takes too much space and we didn't finish the first 
group (as if you have more than 1 group this is case 1)
   
   in the first case, emitting early will solve the problem and you can do that 
both in partial and final as you know that you won't get more values for that 
group
   
   in the second case, regular spilling is not needed as it involves doing 
multi level merge sort which will try to sort a single key which is useless.
   
   instead when ordering is full, you just read them back and merge as you go, 
and if you don't have enough memory while doing merging, than we can spill and 
continue with the merge that include that new spill file (make sure to bail to 
avoid infinite loop of spill and read)
   
   not have to be in this pr.
   
   
   If I'm not mistaken currently when there is a spill it will wait for all 
data from all groups to be received and spill them and then will move to the 
merge mode. which for group ordering full is not needed, once we finished the 
group we can read from spill files and do the merge on that group only, so it 
combine spill and emit early
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Implement disk spilling for all grouping ordering modes in GroupedHashAggregateStream [datafusion]

Reply via email to