korowa commented on code in PR #11627: URL: https://github.com/apache/datafusion/pull/11627#discussion_r1694260145
########## datafusion/physical-expr-common/src/aggregate/groups_accumulator/prim_op.rs: ########## @@ -134,6 +134,46 @@ where self.update_batch(values, group_indices, opt_filter, total_num_groups) } + fn convert_to_state( + &self, + values: &[ArrayRef], + opt_filter: Option<&BooleanArray>, + ) -> Result<Vec<ArrayRef>> { + let values = values[0].as_primitive::<T>(); + let mut state = PrimitiveBuilder::<T>::with_capacity(values.len()) + .with_data_type(self.data_type.clone()); + + match opt_filter { + Some(filter) => { + values Review Comment: Yes, but the example shows that it filters out values from the source array, and conversion to state must produce the same number of elements, just placing nulls instead of filtered values, so I'm planning to look for smth like "apply null mask". I've started with some benchmarks (criterion based ones) and they show that current code for nullable columns (at least for count) is significantly slower that for non nullable ones (~15 times :disappointed: ), probably some part of this time can be recovered. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org