[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4924: Unify Row hash and hash implementation

GitBox Wed, 18 Jan 2023 06:04:35 -0800


alamb commented on code in PR #4924:
URL: https://github.com/apache/arrow-datafusion/pull/4924#discussion_r1073568526



##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -101,6 +109,10 @@ struct GroupedHashAggregateStreamV2Inner {
     /// if the result is chunked into batches,
     /// last offset is preserved for continuation.
     row_group_skip_position: usize,
+    /// keeps range for each accumulator in the field
+    /// first element in the array corresponds to normal accumulators
+    /// second element in the array corresponds to row accumulators
+    indices: [Vec<Range<usize>>; 2],

Review Comment:
   Having to keep two lists of accumulators is unfortunate (the code might be 
simpler if they were in a single enum or behind a trait). However, I think 
this implementation is better than what we have on the master branch, because 
it at least confines the duplication to the aggregators rather than 
duplicating the entire GroupHash structure.
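
   To illustrate the single-enum idea, here is a minimal sketch. It does not 
use DataFusion's real accumulator traits: `NormalAcc`, `RowAcc`, 
`AccumulatorKind`, `Entry`, and `update_all` are hypothetical stand-ins, and 
the update logic is a placeholder. It only shows how one list of enum values 
could replace a two-element array of per-kind lists.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a "normal" accumulator (not DataFusion's type).
#[derive(Debug, Default)]
struct NormalAcc {
    sum: i64,
}

/// Hypothetical stand-in for a row-format accumulator (not DataFusion's type).
#[derive(Debug, Default)]
struct RowAcc {
    count: u64,
}

/// One enum covering both kinds, so a single list can hold either.
#[derive(Debug)]
enum AccumulatorKind {
    Normal(NormalAcc),
    Row(RowAcc),
}

impl AccumulatorKind {
    /// Dispatch on the variant; callers no longer track which list they are in.
    fn update(&mut self, value: i64) {
        match self {
            AccumulatorKind::Normal(acc) => acc.sum += value,
            AccumulatorKind::Row(acc) => acc.count += 1,
        }
    }
}

/// Pairs each accumulator with the range of fields it covers (assumed layout).
#[derive(Debug)]
struct Entry {
    acc: AccumulatorKind,
    range: Range<usize>,
}

/// Feed every value in an entry's range to that entry's accumulator.
fn update_all(entries: &mut [Entry], row: &[i64]) {
    for entry in entries.iter_mut() {
        for value in &row[entry.range.clone()] {
            entry.acc.update(*value);
        }
    }
}
```

   The trade-off is a `match` on each update instead of indexing into one of 
two lists; a trait object would give a similar shape with dynamic dispatch.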


