mingmwang commented on code in PR #5904:
URL: https://github.com/apache/arrow-datafusion/pull/5904#discussion_r1160414255
##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -514,6 +580,19 @@ impl GroupedHashAggregateStream {
}
}
+fn get_accumulator_set_size(
+    groups_with_rows: &[usize],
+    row_group_states: &[RowGroupState],
+) -> usize {
+    groups_with_rows.iter().fold(0usize, |acc, group_idx| {
+        let group_state = &row_group_states[*group_idx];
+        group_state
+            .accumulator_set
+            .iter()
+            .fold(acc, |acc, accumulator| acc + accumulator.size())
+    })
+}
+
/// The state that is built for each output group.
Review Comment:
I think the logic for collecting the memory size of the accumulators is quite
complex. The size bookkeeping may end up costing more than the aggregation
work it is meant to account for.
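
For illustration only, here is a minimal sketch of how the same total could be written as a flat iterator chain instead of a nested `fold`. The `Accumulator` trait, `FixedSize`, and the `RowGroupState` struct below are simplified, hypothetical stand-ins so the snippet compiles on its own; they are not the real DataFusion definitions. This only makes the code easier to read. It still visits every accumulator of every touched group, so it does not reduce the cost itself.

```rust
// Hypothetical stand-in for the accumulator interface; only `size()` matters here.
trait Accumulator {
    /// Approximate number of bytes held by this accumulator.
    fn size(&self) -> usize;
}

// Hypothetical accumulator that reports a fixed size, used only for the sketch.
struct FixedSize(usize);

impl Accumulator for FixedSize {
    fn size(&self) -> usize {
        self.0
    }
}

// Simplified stand-in for the per-group state in the diff above.
struct RowGroupState {
    accumulator_set: Vec<Box<dyn Accumulator>>,
}

/// Sums `size()` over the accumulators of the groups listed in
/// `groups_with_rows`, the same result as the nested `fold` in the diff.
fn get_accumulator_set_size(
    groups_with_rows: &[usize],
    row_group_states: &[RowGroupState],
) -> usize {
    groups_with_rows
        .iter()
        // Walk the accumulators of each group touched by the current batch.
        .flat_map(|&group_idx| row_group_states[group_idx].accumulator_set.iter())
        .map(|accumulator| accumulator.size())
        .sum()
}
```

With three groups whose accumulators report 8 + 16, 100, and 4 bytes, calling this with `&[0, 2]` gives 28, and an empty index slice gives 0.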