gstvg commented on code in PR #18921:
URL: https://github.com/apache/datafusion/pull/18921#discussion_r2898017709


##########
datafusion/substrait/src/logical_plan/producer/expr/mod.rs:
##########
@@ -152,6 +152,9 @@ pub fn to_substrait_rex(
             not_impl_err!("Cannot convert {expr:?} to Substrait")
         }
         Expr::Unnest(expr) => not_impl_err!("Cannot convert {expr:?} to 
Substrait"),
+        Expr::LambdaFunction(expr) => producer.handle_lambda_function(expr, 
schema),
+        Expr::Lambda(expr) => not_impl_err!("Cannot convert {expr:?} to 
Substrait"), // not yet implemented in substrait-rs
+        Expr::LambdaVariable(expr) => not_impl_err!("Cannot convert {expr:?} 
to Substrait"), // not yet implemented in substrait-rs

Review Comment:
   Thanks @LiaCastaneda for the info and @benbellick for the release
   > I wonder if the expressions can actually be mapped 1:1
   
   I did a quick check and I believe it's possible, yes. 
   
   `LambdaFunction` produces a regular substrait `ScalarFunction`, and to 
consume we just check whether any `LambdaUDF` exists with the given function 
name and/or if any arg is a lambda.
   
   To produce a substrait 
[Lambda](https://substrait.io/expressions/lambda_expressions/#lambda-expression-structure)
 with the field `parameters` struct where each field corresponds to a 
parameter, we use the return of `LambdaUDF::lambdas_parameters`, and for 
consuming, we can use default parameters names like `p1`, `p2` etc
    
   To produce a 
[FieldReference](https://substrait.io/expressions/field_references/)/[LambdaParameterReference](https://substrait.io/expressions/lambda_expressions/#parameter-references)
 with a correct `steps_out` and `direct_reference.struct_field.field`, we keep 
a `HashMap<String, (usize, usize)>` of `lambda_parameter_name => (step_outs, 
struct_field_index)`, that get's updated for every `Lambda` to be produced: 
`step_outs` is reset to 0 for shadowed parameters, others incremented by one, 
and `struct_field_index` set according to it's position within the parameters 
of the lambda it originates from. Consuming into `LambdaVariable` with it's 
field resolved is similar to SQL planning: keep a `HashMap<usize, FieldRef>` of 
`direct_reference.struct_field.field => field` that get's updated for every 
`LambdaUDF` by the return of it's `lambdas_parameters` method 
   
   I'll address your another comments and then implement substrait support on 
another branch to make sure is viable



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to