alamb commented on code in PR #8183:
URL: https://github.com/apache/arrow-datafusion/pull/8183#discussion_r1394805720
##########
datafusion/expr/src/function.rs:
##########
@@ -36,10 +36,32 @@ use strum::IntoEnumIterator;
pub type ScalarFunctionImplementation =
Arc<dyn Fn(&[ColumnarValue]) -> Result<ColumnarValue> + Send + Sync>;
+/// Constant argument, (arg index, constant value).
+pub type ConstantArg = (usize, ScalarValue);
+
+/// Factory that returns the functions's return type given the input argument
types and constant arguments
+pub trait ReturnTypeFactory: Send + Sync {
+ fn infer(
+ &self,
+ input_types: &[DataType],
+ constant_args: &[ConstantArg],
+ ) -> Result<Arc<DataType>>;
+}
+
/// Factory that returns the functions's return type given the input argument
types
pub type ReturnTypeFunction =
Review Comment:
I wonder if you considered changing the `ReturnTypeFunction` to something
more like
```
Arc<dyn Fn(&[Expr], &dyn ExprSchemable) -> Result<Arc<DataType>> + Send
+ Sync>;
```
Now that we have made the fields of `ScalarUDF` non pub (in
https://github.com/apache/arrow-datafusion/pull/8079) I think we have much
greater leeway to improve the API
##########
datafusion/expr/src/function.rs:
##########
@@ -36,10 +36,32 @@ use strum::IntoEnumIterator;
pub type ScalarFunctionImplementation =
Arc<dyn Fn(&[ColumnarValue]) -> Result<ColumnarValue> + Send + Sync>;
+/// Constant argument, (arg index, constant value).
+pub type ConstantArg = (usize, ScalarValue);
+
+/// Factory that returns the functions's return type given the input argument
types and constant arguments
+pub trait ReturnTypeFactory: Send + Sync {
+ fn infer(
+ &self,
+ input_types: &[DataType],
+ constant_args: &[ConstantArg],
+ ) -> Result<Arc<DataType>>;
+}
+
/// Factory that returns the functions's return type given the input argument
types
pub type ReturnTypeFunction =
Review Comment:
Perhaps we could introduce the `FunctionImplementation` trait as proposed
here:
https://github.com/apache/arrow-datafusion/pull/8046/files#diff-8a327db2db945bcf6ca2b4229885532feae127e94a450600d3fac6ecdc0eeb3fR141
with the more general return types. Something like this perhaps:
```rust
/// Convenience trait for implementing ScalarUDF. See
[`ScalarUDF::new_from_impl()`]
pub trait FunctionImplementation {
/// Returns this function's name
fn name(&self) -> &str;
/// Returns this function's signature
fn signature(&self) -> &Signature;
/// return the return type of this function given the types of the
arguments
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>;
/// return the return type of this function given the actual arguments.
This is
/// used to implement functions where the return type is a function of
the actual
/// arguments
fn return_type_from_args(&self, args: &[Expr], schemabe: &dyn
ExprSchemable) -> Result<DataType> {
// default impl would call `Self::return_type`
todo!()
}
/// Invoke the function on `args`, returning the appropriate result
fn invoke(&self, args: &[ColumnarValue]) -> Result<ColumnarValue>;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]