[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #5904: Refine the size() calculation of accumulator

via GitHub Thu, 06 Apr 2023 20:46:16 -0700


mingmwang commented on code in PR #5904:
URL: https://github.com/apache/arrow-datafusion/pull/5904#discussion_r1160414255



##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -514,6 +580,19 @@ impl GroupedHashAggregateStream {
     }
 }
 
+fn get_accumulator_set_size(
+    groups_with_rows: &[usize],
+    row_group_states: &[RowGroupState],
+) -> usize {
+    groups_with_rows.iter().fold(0usize, |acc, group_idx| {
+        let group_state = &row_group_states[*group_idx];
+        group_state
+            .accumulator_set
+            .iter()
+            .fold(acc, |acc, accumulator| acc + accumulator.size())
+    })
+}
+
 /// The state that is built for each output group.

Review Comment:
   I think the logic for collecting the memory size of the accumulators is 
quite complex. Computing it may cost more than the actual aggregation work.
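
   To make the cost concrete, here is a minimal, self-contained sketch of the 
same traversal. The `Accumulator` trait, `FixedSizeAccumulator`, the `fixed` 
helper, and this `RowGroupState` are simplified stand-ins assumed for 
illustration, not the real DataFusion types. The point is that every call 
visits each accumulator of each group that received rows in the batch, so the 
bookkeeping grows with (groups touched) x (accumulators per group).

```rust
/// Simplified stand-in for an accumulator that can report its memory use
/// (assumption: the real trait has many more methods).
trait Accumulator {
    fn size(&self) -> usize;
}

/// Hypothetical accumulator with a fixed, known footprint in bytes.
struct FixedSizeAccumulator {
    bytes: usize,
}

impl Accumulator for FixedSizeAccumulator {
    fn size(&self) -> usize {
        self.bytes
    }
}

/// Hypothetical helper that boxes a fixed-size accumulator as a trait object.
fn fixed(bytes: usize) -> Box<dyn Accumulator> {
    Box::new(FixedSizeAccumulator { bytes })
}

/// Simplified stand-in for the per-group state: only the accumulators.
struct RowGroupState {
    accumulator_set: Vec<Box<dyn Accumulator>>,
}

/// Sums `size()` over every accumulator of every group listed in
/// `groups_with_rows`. Same result as the nested `fold` in the diff,
/// written with `flat_map` and `sum`.
fn get_accumulator_set_size(
    groups_with_rows: &[usize],
    row_group_states: &[RowGroupState],
) -> usize {
    groups_with_rows
        .iter()
        .flat_map(move |&group_idx| row_group_states[group_idx].accumulator_set.iter())
        .map(|accumulator| accumulator.size())
        .sum()
}
```

   With three groups holding accumulators of 8 and 16 bytes, 32 bytes, and 4 
and 4 bytes, calling the function with `groups_with_rows = [0, 2]` walks four 
accumulators and returns 32.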



