ozankabak commented on code in PR #4924:
URL: https://github.com/apache/arrow-datafusion/pull/4924#discussion_r1072282133


##########
datafusion/core/src/physical_plan/aggregates/row_hash.rs:
##########
@@ -70,29 +74,33 @@ use hashbrown::raw::RawTable;
 ///
 /// [Compact]: datafusion_row::layout::RowType::Compact
 /// [WordAligned]: datafusion_row::layout::RowType::WordAligned
-pub(crate) struct GroupedHashAggregateStreamV2 {
+pub(crate) struct GroupedHashAggregateStream {
     stream: BoxStream<'static, ArrowResult<RecordBatch>>,
     schema: SchemaRef,
 }
 
-/// Actual implementation of [`GroupedHashAggregateStreamV2`].
+/// Actual implementation of [`GroupedHashAggregateStream`].
 ///
 /// This is wrapped into yet another struct because we need to interact with 
the async memory management subsystem
 /// during poll. To have as little code "weirdness" as possible, we chose to 
just use [`BoxStream`] together with
-/// [`futures::stream::unfold`]. The latter requires a state object, which is 
[`GroupedHashAggregateStreamV2Inner`].
-struct GroupedHashAggregateStreamV2Inner {
+/// [`futures::stream::unfold`]. The latter requires a state object, which is 
[`GroupedHashAggregateStreamInner`].
+struct GroupedHashAggregateStreamInner {
     schema: SchemaRef,
     input: SendableRecordBatchStream,
     mode: AggregateMode,
-    aggr_state: AggregationState,
-    aggregate_expressions: Vec<Vec<Arc<dyn PhysicalExpr>>>,
+    normal_aggr_expr: Vec<Arc<dyn AggregateExpr>>,
+    row_aggr_state: RowAggregationState,
+    /// Aggregate expressions not supporting row accumulation
+    normal_aggregate_expressions: Vec<Vec<Arc<dyn PhysicalExpr>>>,

Review Comment:
   Yes, one operator but multiple accumulators. Having a single accumulator 
implementation without performance impact is a tough goal and there seems to be 
some open questions around it. For example, some accumulation tasks involve 
non-fixed states, how do we implement them in the row-accumulator style?
   
   Until we see a clear path forward in that regard, I think this is a good 
first step towards full unification (if such a thing is possible).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to