Rachelint commented on code in PR #11943:
URL: https://github.com/apache/datafusion/pull/11943#discussion_r1730379309


##########
datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/prim_op.rs:
##########
@@ -92,32 +101,69 @@ where
         opt_filter: Option<&BooleanArray>,
         total_num_groups: usize,
     ) -> Result<()> {
+        if total_num_groups == 0 {
+            return Ok(());
+        }
+
         assert_eq!(values.len(), 1, "single argument to update_batch");
         let values = values[0].as_primitive::<T>();
 
-        // update values
-        self.values.resize(total_num_groups, self.starting_value);
-
         // NullState dispatches / handles tracking nulls and groups that saw 
no values
-        self.null_state.accumulate(
-            group_indices,
-            values,
-            opt_filter,
-            total_num_groups,
-            |group_index, new_value| {
-                let value = &mut self.values[group_index];
-                (self.prim_fn)(value, new_value);
-            },
-        );
+        match self.mode {
+            GroupStatesMode::Flat => {
+                // Ensure enough room in values
+                ensure_enough_room_for_flat_values(
+                    &mut self.values_blocks,
+                    total_num_groups,
+                    self.starting_value,
+                );
+
+                let block = self.values_blocks.current_mut().unwrap();
+                self.null_state.accumulate_for_flat(
+                    group_indices,
+                    values,
+                    opt_filter,
+                    total_num_groups,
+                    |group_index, new_value| {
+                        let value = &mut block[group_index];
+                        (self.prim_fn)(value, new_value);
+                    },
+                );
+            }
+            GroupStatesMode::Blocked(blk_size) => {

Review Comment:
   @alamb I try to merge the two mode(flat and blocked) into one this weekend.
   
   But these two modes have the different parsing method for group index, if we 
eliminate branches, we will need to use `trait object` or `function pointer` to 
support this, and that will lead to dynamic dispatch for the row level 
operation, and finally hurt the performance...
   
   In my local
   - q32 + static dispatch, 6800+ms
   - q32 + dynmaic dispatch, 7100+ms
   - q22(can't support blocked emission) + static dispatch, 1980+ms
   - q22(can't support blocked emission) + static dispatch, 2050+ms
   
   Is it ok to keep some branches for these two modes? like:
   
https://github.com/Rachelint/arrow-datafusion/blob/4fc0a27848bf490f2612a1525407aafe2743076d/datafusion/functions-aggregate/src/count.rs#L394-L418
   
   What I am worried about is that this will lead to slight performance 
regression although we disabled the new feature(like q22)...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to