2010YOUY01 commented on PR #7978:
URL: 
https://github.com/apache/arrow-datafusion/pull/7978#issuecomment-1788433924

   > Thank you @2010YOUY01 -- this is looking like a great first start.
   > 
   > One thing that came to mind when I was reviewing this PR was what about 
simply moving `BuiltInScalarFunction`s entirely into the `physical_expr` crate 
(and remove `Expr::ScalarFunction`). We would have to extend ScalarUDF... but 
that would have the nice property of keeping things uniform
   
   Thank you for your review. 
   
   For the `impl ScalarFunctionDef` is split in two crate issue, here are my 
thoughts:
   ### Motivation
   I think it's better to `impl ScalarFunctionDef` at logical expression level, 
to mimic all buitlin functions are defined outside the core, then they can go 
through planning, logical plan optimization, and all the way to execution. This 
way we can make sure the new interface is complete and replace all execution 
code.
   
   ### `impl ScalarFunctionDef` fully in `phys-expr` crate
   If we move `impl ScalarFunctionDef for BuiltinScalarFunction` into 
`phys-expr` crate, then we can only replace the last half of execution code 
(initial physical plan -> physical plan optimization -> execution) with the new 
interface.
   This way we have to keep two `Expr` nodes existing at the same time
   ```rust
   // Expr node for scalar functions
   #[derive(Clone, PartialEq, Eq, Hash, Debug)]
   pub struct ScalarFunction {
       pub fun: built_in_function::BuiltinScalarFunction,
       pub args: Vec<Expr>,
   }
   
   #[derive(Clone, PartialEq, Eq, Hash, Debug)]
   pub struct ScalarFunction2 {
       pub fun: Arc<dyn ScalarFunctionDef>,
       pub args: Vec<Expr>,
   }
   ```
   Then the execution would be
   {Original BuiltinScalarFunctions} --(planning)--> `Expr::ScalarFunction` --> 
`dyn ScalarFunctionDef` --> PhysicalPlanNode
   {moved-out-function-packages, UDFs} --(planning)--> `Expr::ScalarFunction2` 
--> `dyn ScalarFunctionDef` --> PhysicalPlanNode
   This way we can only replace the function execution code for physical plan, 
logical plan code can only be removed after all functions are moved to the new 
interface. So the physical plan part of the code can be reused, but there will 
be two separate code path for logical planning.
   (But this PR showed the new interface is able to cover full functionalities, 
if the above-mentioned issues are acceptable, we can go this way)
   
   ### `impl ScalarFunctionDef` split in two places
   1. We can replace all SQL execution code at first
   2. Only keep one `Expr` node 
   ```rust
   // Expr node for scalar functions
   #[derive(Clone, PartialEq, Eq, Hash, Debug)]
   pub struct ScalarFunction {
       pub fun: Arc<dyn ScalarFunctionDef>, // modified
       pub args: Vec<Expr>,
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to