universalmind303 commented on code in PR #8985:
URL: https://github.com/apache/arrow-datafusion/pull/8985#discussion_r1483844098


##########
datafusion/expr/src/udf.rs:
##########
@@ -249,6 +261,22 @@ pub trait ScalarUDFImpl: Debug + Send + Sync {
     /// the arguments
     fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>;
 
+    /// What [`DataType`] will be returned by this function, given the types of
+    /// the expr arguments
+    fn return_type_from_exprs(

Review Comment:
   Since we need to use the `ExprSchema` here, but there's issues going from 
`<S: ExprSchema>` to `&dyn ExprSchema` inside the trait, what if we moved up 
the default implementation out of the trait and into `ScalarUDF`? 
   
   we could change the trait impl to something like this
   
   ```rs
   pub trait ScalarUDFImpl: Debug + Send + Sync {
       /// What [`DataType`] will be returned by this function, given the types 
of
       /// the expr arguments
       fn return_type_from_exprs(
           &self,
           arg_exprs: &[Expr],
           schema: &dyn ExprSchema,
       ) -> Option<Result<DataType>> {
           // The default implementation returns None
           // so that people don't have to implement `return_type_from_exprs` 
if they dont want to
           None
       }
   }
   ```
   then change the `ScalarUDF` impl to this
   ```rs
   impl ScalarUDF
       /// The datatype this function returns given the input argument input 
types.
       /// This function is used when the input arguments are [`Expr`]s.
       /// See [`ScalarUDFImpl::return_type_from_exprs`] for more details.
       pub fn return_type_from_exprs<S: ExprSchema>(
           &self,
           args: &[Expr],
           schema: &S,
       ) -> Result<DataType> {
           // If the implementation provides a return_type_from_exprs, use it
           if let Some(return_type) = self.inner.return_type_from_exprs(args, 
schema) {
               return_type
           // Otherwise, use the return_type function
           } else {
               let arg_types = args
                   .iter()
                   .map(|arg| arg.get_type(schema))
                   .collect::<Result<Vec<_>>>()?;
               self.return_type(&arg_types)
           }
       }
   }
   ```
   
   this way we don't need to constrain the `ExprSchemable` functions to 
`?Sized`, and we can update the `get_type` function to use the 
`return_type_from_exprs` without any compile time errors. 
   
   ```rs
   ScalarFunctionDefinition::UDF(fun) => {
       Ok(fun.return_type_from_exprs(&args, schema)?)
   }
   ```
   
   and it still makes `return_type_from_exprs` an opt-in method. 
    



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to