jorgecarleitao commented on a change in pull request #7971:
URL: https://github.com/apache/arrow/pull/7971#discussion_r471899766



##########
File path: rust/datafusion/src/execution/physical_plan/udf.rs
##########
@@ -146,3 +154,99 @@ impl PhysicalExpr for ScalarFunctionExpr {
         (fun)(&inputs)
     }
 }
+
+/// A generic aggregate function
+/*
+An aggregate function accepts an arbitrary number of arguments, of arbitrary data types,
+and returns an arbitrary type based on the incoming types.
+
+It is the developer of the function's responsibility to ensure that the aggregator correctly handles the different
+types that are presented to them, and that the return type correctly matches the type returned by the
+aggregator.
+
+It is the user of the function's responsibility to pass arguments to the function that have valid types.
+*/
+#[derive(Clone)]
+pub struct AggregateFunction {
+    /// Function name
+    pub name: String,
+    /// A list of arguments and their respective types. A function can accept more than one type as argument
+    /// (e.g. sum(i8), sum(u8)).
+    pub arg_types: Vec<Vec<DataType>>,
+    /// Return type. This function takes
+    pub return_type: ReturnType,

Review comment:
       This change is under discussion on the mailing list.
   
   Essentially, the question is whether UDFs should be allowed to have an 
input-dependent return type, i.e. whether this field should be a function of 
the argument types or a fixed DataType.
   
   If we decide not to accept input-dependent types, then UDFs are simpler 
(multiple input types, single output type), but we can't rewrite our 
aggregates as UDFs.
   
   If we decide to accept input-dependent types, then UDFs are more complex 
(multiple input types, multiple output types), but we can unify them all under 
a single interface in our code.
   
   We can also do something in the middle: declare an internal interface for 
functions that supports (multiple input types, multiple output types), but only 
expose public interfaces to register (multiple input types, single output type) 
UDFs.
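
   To make the middle option concrete, here is a rough sketch, not the code in 
this PR. The `DataType` enum is a reduced stand-in for arrow's type, and the 
names `ReturnTypeFn`, `fixed_return_type` and `sum_return_type` are 
hypothetical. The internal interface is a function from argument types to a 
return type, and the public helper only lets a caller supply a fixed type.

```rust
use std::sync::Arc;

// Hypothetical stand-in for arrow's DataType, reduced to a few variants.
#[derive(Clone, Debug, PartialEq)]
pub enum DataType {
    Int8,
    Int64,
    UInt8,
    UInt64,
    Float64,
}

// Internal form (assumed name): the return type is computed from the
// argument types, so it can differ per input signature.
pub type ReturnTypeFn =
    Arc<dyn Fn(&[DataType]) -> Result<DataType, String> + Send + Sync>;

// Public form (assumed name): the caller supplies one fixed return type,
// which is wrapped into the internal, input-dependent interface.
pub fn fixed_return_type(t: DataType) -> ReturnTypeFn {
    Arc::new(move |_args: &[DataType]| -> Result<DataType, String> {
        Ok(t.clone())
    })
}

// An input-dependent return type, as a built-in `sum` might declare it.
// The widening rules here are illustrative only.
pub fn sum_return_type() -> ReturnTypeFn {
    Arc::new(|args: &[DataType]| -> Result<DataType, String> {
        match args {
            [DataType::Int8] | [DataType::Int64] => Ok(DataType::Int64),
            [DataType::UInt8] | [DataType::UInt64] => Ok(DataType::UInt64),
            [DataType::Float64] => Ok(DataType::Float64),
            other => Err(format!("sum does not support {:?}", other)),
        }
    })
}
```

   With this shape, built-in aggregates and user-registered functions share one 
internal representation, while the public registration path stays limited to a 
single output type.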
   





