kosiew commented on code in PR #22639:
URL: https://github.com/apache/datafusion/pull/22639#discussion_r3432922097


##########
datafusion/functions-aggregate/src/average.rs:
##########
@@ -955,22 +1058,149 @@ where
             .as_primitive::<T>()
             .clone()
             .with_data_type(self.sum_data_type.clone());
-        let counts = UInt64Array::from_value(1, sums.len());
+        let counts = PrimitiveArray::<CountType>::from_value(
+            CountType::Native::usize_as(1),
+            sums.len(),
+        );
 
         let nulls = filtered_null_mask(opt_filter, &sums);
-
-        // set nulls on the arrays
         let counts = set_nulls(counts, nulls.clone());
         let sums = set_nulls(sums, nulls);
 
-        Ok(vec![Arc::new(counts) as ArrayRef, Arc::new(sums)])
+        Ok(Layout::state_arrays(sums, counts))
     }
 
     fn supports_convert_to_state(&self) -> bool {
         true
     }
 
     fn size(&self) -> usize {
-        self.counts.capacity() * size_of::<u64>() + self.sums.capacity() * 
size_of::<T>()
+        self.counts.capacity() * size_of::<CountType::Native>()
+            + self.sums.capacity() * size_of::<T::Native>()
+            + self.null_state.size()
+    }
+}
+
+/// Creates the Spark AVG grouped accumulator using the built-in AVG state
+/// handling implementation while preserving Spark's `[sum, count]` state order
+/// and signed count type.
+#[doc(hidden)]
+pub fn spark_avg_groups_accumulator(

Review Comment:
   I moved to a neutral shared abstraction that both built-in AVG and Spark AVG 
use.
   
   Changes:
   - `datafusion/functions-aggregate/src/average.rs`
     - Removed Spark-specific names/API from the built-in aggregate crate.
     - Replaced `SparkAvgLayout`, `SparkAvgGroupsAccumulator`, and 
`spark_avg_groups_accumulator(...)` with neutral shared pieces:
       - `AvgStateLayout`
       - `CountSumAvgStateLayout`
       - `SumCountAvgStateLayout`
       - `create_avg_groups_accumulator(...)`
   - `datafusion/spark/src/function/aggregate/avg.rs`
     - Spark now selects the neutral shared constructor with `Float64Type`, 
`Int64Type`, and `SumCountAvgStateLayout`.
     - Spark still owns its `[sum, count]` + signed-count contract at the call 
site.
   
   Result: shared AVG state/null/merge logic remains unified, but 
`functions-aggregate` no longer exposes Spark-named concepts.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to