alamb commented on code in PR #11627:
URL: https://github.com/apache/datafusion/pull/11627#discussion_r1694253923


##########
datafusion/physical-plan/src/aggregates/row_hash.rs:
##########
@@ -484,6 +612,12 @@ impl Stream for GroupedHashAggregateStream {
                         (
                             if self.input_done {
                                 ExecutionState::Done
+                            } else if self

Review Comment:
   nit is that putting this into a function (like 
`self.should_skip_aggregation()`) would make this logic easier to follow 



##########
datafusion/physical-plan/src/aggregates/row_hash.rs:
##########
@@ -90,6 +94,69 @@ struct SpillState {
     merging_group_by: PhysicalGroupBy,
 }
 
+struct SkipAggregationProbe {

Review Comment:
   FYI @kazuyukitanimura -- I wonder if you have time to review this change to 
hash aggregate spilling  as you originally contributed #7400 
   
   Context:
   * Describes this issue: https://github.com/apache/datafusion/issues/6937 
   * Additional background documentation: 
https://github.com/apache/datafusion/pull/11695



##########
datafusion/physical-expr-common/src/aggregate/groups_accumulator/prim_op.rs:
##########
@@ -134,6 +134,46 @@ where
         self.update_batch(values, group_indices, opt_filter, total_num_groups)
     }
 
+    fn convert_to_state(
+        &self,
+        values: &[ArrayRef],
+        opt_filter: Option<&BooleanArray>,
+    ) -> Result<Vec<ArrayRef>> {
+        let values = values[0].as_primitive::<T>();
+        let mut state = PrimitiveBuilder::<T>::with_capacity(values.len())
+            .with_data_type(self.data_type.clone());
+
+        match opt_filter {
+            Some(filter) => {
+                values

Review Comment:
   I think @Dandandan is suggesting using 
https://docs.rs/arrow/latest/arrow/compute/kernels/filter/fn.filter.html
   
   However, that would likely require a second copy of the values 
(apply/collect filter result and then apply `prim_fn`)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to