alamb commented on code in PR #6904:
URL: https://github.com/apache/arrow-datafusion/pull/6904#discussion_r1260048514
##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -306,460 +370,194 @@ impl RecordBatchStream for GroupedHashAggregateStream {
}
impl GroupedHashAggregateStream {
- // Update the row_aggr_state according to groub_by values (result of group_by_expressions)
+ /// Calculates the group indicies for each input row of
+ /// `group_values`.
+ ///
+ /// At the return of this function,
+ /// `self.scratch_space.current_group_indices` has the same number
+ /// of entries as each array in `group_values` and holds the
+ /// correct group_index for that row.
+ ///
+ /// This is one of the core hot loops in the algorithm
fn update_group_state(
&mut self,
group_values: &[ArrayRef],
allocated: &mut usize,
- ) -> Result<Vec<usize>> {
+ ) -> Result<()> {
Review Comment:
I tried to encode this rationale via comment in 90f8730521
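
For readers without the surrounding code, the contract described in the new doc comment (fill a reusable scratch buffer with one group index per input row, rather than returning a fresh `Vec<usize>`) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in this PR: `Grouper`, the `String` keys, and the `HashMap` lookup are hypothetical stand-ins, the real function takes `&[ArrayRef]`, and the `allocated` memory accounting is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the stream's reusable scratch buffers.
struct ScratchSpace {
    /// One entry per input row of the most recent batch.
    current_group_indices: Vec<usize>,
}

/// Hypothetical, simplified grouping state (not the PR's actual struct).
struct Grouper {
    /// Maps a group key to its group index.
    map: HashMap<String, usize>,
    /// Group keys in order of first appearance; position == group index.
    group_keys: Vec<String>,
    scratch_space: ScratchSpace,
}

impl Grouper {
    fn new() -> Self {
        Self {
            map: HashMap::new(),
            group_keys: Vec::new(),
            scratch_space: ScratchSpace {
                current_group_indices: Vec::new(),
            },
        }
    }

    /// Computes the group index for each input row. On return,
    /// `scratch_space.current_group_indices` has exactly one entry per row.
    fn update_group_state(&mut self, group_values: &[&str]) {
        // Reuse the existing allocation instead of building a new Vec per batch.
        let indices = &mut self.scratch_space.current_group_indices;
        indices.clear();

        for &key in group_values {
            // Index a new group would receive if this key has not been seen.
            let next = self.group_keys.len();
            let idx = *self.map.entry(key.to_string()).or_insert(next);
            if idx == next {
                // First time this key appears: record it as a new group.
                self.group_keys.push(key.to_string());
            }
            indices.push(idx);
        }
    }
}
```

In this sketch, calling `update_group_state` on a second batch overwrites the indices from the first batch while the group table keeps growing, which is the behavior the `Result<()>` signature implies.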