Jefffrey commented on code in PR #22639:
URL: https://github.com/apache/datafusion/pull/22639#discussion_r3425516398


##########
datafusion/functions-aggregate/src/average.rs:
##########
@@ -788,14 +786,99 @@ where
 
     /// Function that computes the final average (value / count)
     avg_fn: F,
+
+    /// AVG state layout marker.
+    layout: PhantomData<Layout>,
+}
+
+#[derive(Debug)]
+struct BuiltInAvgLayout;

Review Comment:
   I don't know if the layout of state fields differing between the 
implementations is actually significant 🤔 



##########
datafusion/functions-aggregate/src/average.rs:
##########
@@ -955,22 +1058,149 @@ where
             .as_primitive::<T>()
             .clone()
             .with_data_type(self.sum_data_type.clone());
-        let counts = UInt64Array::from_value(1, sums.len());
+        let counts = PrimitiveArray::<CountType>::from_value(
+            CountType::Native::usize_as(1),
+            sums.len(),
+        );
 
         let nulls = filtered_null_mask(opt_filter, &sums);
-
-        // set nulls on the arrays
         let counts = set_nulls(counts, nulls.clone());
         let sums = set_nulls(sums, nulls);
 
-        Ok(vec![Arc::new(counts) as ArrayRef, Arc::new(sums)])
+        Ok(Layout::state_arrays(sums, counts))
     }
 
     fn supports_convert_to_state(&self) -> bool {
         true
     }
 
     fn size(&self) -> usize {
-        self.counts.capacity() * size_of::<u64>() + self.sums.capacity() * 
size_of::<T>()
+        self.counts.capacity() * size_of::<CountType::Native>()
+            + self.sums.capacity() * size_of::<T::Native>()
+            + self.null_state.size()
+    }
+}
+
+/// Creates the Spark AVG grouped accumulator using the built-in AVG state
+/// handling implementation while preserving Spark's `[sum, count]` state order
+/// and signed count type.
+#[doc(hidden)]
+pub fn spark_avg_groups_accumulator(

Review Comment:
   I'm personally not sure about pulling in Spark specific code to 
functions-aggregate 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to