alamb opened a new issue, #9289:
URL: https://github.com/apache/arrow-datafusion/issues/9289

   ### Is your feature request related to a problem or challenge?
   
   As part of porting `arrow_cast` 
https://github.com/apache/arrow-datafusion/issues/9287 and `make_array` 
https://github.com/apache/arrow-datafusion/issues/9288, it has become clear 
that some functions have special simplification semantics:
   1. `arrow_cast` simplifies to `cast` of a particular data type. It is 
important for the rest of datafusion to understand the `Cast` semantics, as 
there are several special cases for `Expr::Cast` (such as 
[`unwrap_cast_in_comparison`](https://github.com/apache/arrow-datafusion/blob/ee9736fc81b4461d417b409baeaeb9f5e01ad962/datafusion/optimizer/src/unwrap_cast_in_comparison.rs#L40))
   2. `make_array` has special rewrite rules to combine / fold with 
`array_append` and `array_prepend` in the [simplifier (source 
link)](https://github.com/apache/arrow-datafusion/blob/ee9736fc81b4461d417b409baeaeb9f5e01ad962/datafusion/optimizer/src/analyzer/rewrite_expr.rs#L135-L146)
   
   Also, I think some systems may want to provide the ability to define UDFs 
that are more like macros (that is, UDFs that can be expressed in terms of 
other built-in expressions), which needs some way for datafusion to "inline" 
the definition.
   
   Similarly, specialized functions (e.g. replacing `regexp_match` with a 
version that has the regexp pre-compiled ...) - 
https://github.com/apache/arrow-datafusion/issues/8051 - sound very similar.
   
   
   ### Describe the solution you'd like
   
   I propose adding a function such as the following, so that we could 
implement the simplifications for `make_array` and `arrow_cast` in the UDF 
(and not in the main simplify code):
   
   
   ```rust
   /// Was the expression simplified?
   enum Simplified {
     /// The function call was simplified to an entirely new Expr
     Rewritten(Expr),
     /// The function call could not be simplified, and the arguments
     /// are returned unmodified
     Original(Vec<Expr>)
   }
   ```
   
   
   ```rust
   pub trait ScalarUDFImpl {
   ...
   
     /// Apply any function specific simplification rules
     ///
     /// Some functions like arrow_cast have special semantic simplifications
     /// (into `Expr::Cast` for example) that can improve planning.
     ///
     /// If there is a simpler representation of calling this function with the
     /// specified arguments, return `Simplified::Rewritten` with the
     /// simplification.
     /// If no such simplification was possible, return `Simplified::Original`
     /// with the unmodified arguments (the default)
     ///
     /// This function should only apply simplifications specific to this
     /// function.
     /// DataFusion will automatically simplify the arguments with a variety
     /// of rewrites during optimization
     fn simplify(&self, args: Vec<Expr>) -> Result<Simplified> {
       Ok(Simplified::Original(args))
     }
     ...
   }
   ```
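   
   To make the proposal concrete, here is a rough, self-contained sketch of 
   how an `arrow_cast`-style UDF might use such a hook. The `Expr` and 
   `DataType` enums below are tiny stand-ins invented for illustration (they 
   are not DataFusion's real types), `NoSimplify` is a hypothetical UDF that 
   keeps the default, and the error type is simplified to `String`:
   
   ```rust
   // Stand-in types for illustration only; NOT DataFusion's real definitions.
   #[derive(Debug, Clone, PartialEq)]
   enum DataType {
       Int64,
       Utf8,
   }
   
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column(String),
       Literal(String),
       Cast { expr: Box<Expr>, data_type: DataType },
   }
   
   /// Was the expression simplified?
   #[derive(Debug, PartialEq)]
   enum Simplified {
       /// The function call was simplified to an entirely new Expr
       Rewritten(Expr),
       /// The function call could not be simplified; arguments are returned unmodified
       Original(Vec<Expr>),
   }
   
   trait ScalarUDFImpl {
       /// Default: no function-specific simplification, hand the arguments back.
       fn simplify(&self, args: Vec<Expr>) -> Result<Simplified, String> {
           Ok(Simplified::Original(args))
       }
   }
   
   /// Hypothetical UDF that relies on the default implementation.
   struct NoSimplify;
   impl ScalarUDFImpl for NoSimplify {}
   
   /// Hypothetical `arrow_cast(expr, 'TypeName')` implementation.
   struct ArrowCast;
   
   impl ScalarUDFImpl for ArrowCast {
       fn simplify(&self, args: Vec<Expr>) -> Result<Simplified, String> {
           // Only rewrite when there are exactly two arguments and the second
           // is a string literal naming a type this sketch knows about.
           let data_type = match args.as_slice() {
               [_, Expr::Literal(name)] => match name.as_str() {
                   "Int64" => Some(DataType::Int64),
                   "Utf8" => Some(DataType::Utf8),
                   _ => None,
               },
               _ => None,
           };
           match data_type {
               Some(data_type) => {
                   // Take ownership of the first argument and wrap it in a cast.
                   let first = args.into_iter().next().expect("two arguments checked above");
                   Ok(Simplified::Rewritten(Expr::Cast {
                       expr: Box::new(first),
                       data_type,
                   }))
               }
               // Anything else is left alone for the rest of the optimizer.
               None => Ok(Simplified::Original(args)),
           }
       }
   }
   ```
   
   With these stand-ins, calling `ArrowCast.simplify` on a column and the 
   literal `'Int64'` yields `Simplified::Rewritten` holding a cast of that 
   column, while an unrecognized type name or a UDF without an override 
   gets its arguments back in `Simplified::Original`.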
   
   
   
   ### Describe alternatives you've considered
   
   There may be better 
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
