comphead commented on PR #18921:
URL: https://github.com/apache/datafusion/pull/18921#issuecomment-3946052171
Thanks @rluvaton and @gstvg, it's nice you mentioned `array_transform`. The
tricky part of this function is that its return type depends on the lambda:
```
array_transform(array<T>, function<T, U>) -> array<U>
```
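To make the type dependency concrete, here is a minimal std-only sketch. `Vec` stands in for an Arrow array and `lengths` is a hypothetical helper; none of this is DataFusion API, it only mirrors the signature above.

```rust
/// Std-only model of `array_transform(array<T>, function<T, U>) -> array<U>`.
/// `Vec` is a stand-in for an Arrow array (assumption for illustration).
fn array_transform<T, U>(input: Vec<T>, f: impl Fn(T) -> U) -> Vec<U> {
    input.into_iter().map(f).collect()
}

/// Hypothetical helper: the output element type (`usize`) is decided by the
/// lambda, not by the input element type (`&str`).
fn lengths(words: Vec<&str>) -> Vec<usize> {
    array_transform(words, |w| w.len())
}
```

In Rust the compiler infers `U` from the closure. In a query engine the same information has to be derived at planning time from the lambda body, which is why the return type cannot be fixed in the function signature alone.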
> I want to keep the simplicity of ScalarUDF which means that in order to
> evaluate a lambda expression I don't need to construct stuff, only need to
> provide the input and maybe some options for future use.
Right, at a high level it could look like this:
```
use std::sync::Arc;

use arrow::datatypes::DataType;
use datafusion::physical_expr::PhysicalExpr;

pub struct LambdaExpr {
    /// Parameter types, already resolved
    pub param_types: Vec<DataType>,
    /// Expression body to evaluate; this could itself be a UDF call
    pub body: Arc<dyn PhysicalExpr>,
}
```
Implementation:
```
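use arrow::array::ArrayRef;
use arrow::datatypes::{Field, Schema};
use arrow::record_batch::RecordBatch;
use datafusion::common::Result;

impl LambdaExpr {
    pub fn new(param_types: Vec<DataType>, body: Arc<dyn PhysicalExpr>) -> Self {
        Self { param_types, body }
    }

    /// Evaluate the lambda over the provided arrays, one per parameter
    pub fn evaluate_with_args(&self, args: Vec<ArrayRef>) -> Result<ArrayRef> {
        // Build a synthetic schema: one nullable field per lambda parameter
        let fields: Vec<Field> = self
            .param_types
            .iter()
            .enumerate()
            .map(|(i, dt)| Field::new(format!("arg{i}"), dt.clone(), true))
            .collect();
        let schema = Arc::new(Schema::new(fields));
        let batch = RecordBatch::try_new(schema, args)?;
        // This is where our UDF would be called. `evaluate` returns a
        // `ColumnarValue`, so convert it to an array of the batch length.
        self.body.evaluate(&batch)?.into_array(batch.num_rows())
    }
}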
```
So for an example like `x -> x + 1`, we need to parse the expression and
create our lambda. That means modifying the parser to produce the structures
below from user-provided code; there is an existing ticket for that:
https://github.com/apache/datafusion-sqlparser-rs/issues/1273
```
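use arrow::datatypes::DataType;
use datafusion::common::ScalarValue;
use datafusion::logical_expr::Operator;
use datafusion::physical_expr::expressions::{BinaryExpr, Column, Literal};

// Parameter x at column 0 ("arg0" in the synthetic schema)
let x = Arc::new(Column::new("arg0", 0));
// Literal 1
let one = Arc::new(Literal::new(ScalarValue::Int32(Some(1))));
// x + 1
let body = Arc::new(BinaryExpr::new(x, Operator::Add, one));
// Lambda(x) -> x + 1
let lambda = LambdaExpr::new(vec![DataType::Int32], body);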
```
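As a rough illustration of what evaluating that body over a batch means, here is a std-only toy model. The `Expr` enum and `x_plus_one` are hypothetical names that only mirror the shape of a column reference, a literal, and a binary add; they are not DataFusion types, and each column is a plain `Vec<i32>`.

```rust
/// Toy physical expression (hypothetical, for illustration only).
enum Expr {
    /// Reference to the lambda parameter stored at this column index
    Column(usize),
    /// Constant broadcast to every row
    Literal(i32),
    /// Element-wise addition of two sub-expressions
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Evaluate over a "batch": one `Vec<i32>` per lambda parameter,
    /// all assumed to have the same length.
    fn evaluate(&self, batch: &[Vec<i32>]) -> Vec<i32> {
        let rows = batch.first().map_or(0, |c| c.len());
        match self {
            Expr::Column(i) => batch[*i].clone(),
            Expr::Literal(v) => vec![*v; rows],
            Expr::Add(l, r) => {
                let (l, r) = (l.evaluate(batch), r.evaluate(batch));
                l.iter().zip(r.iter()).map(|(a, b)| a + b).collect()
            }
        }
    }
}

/// Builds the body of `x -> x + 1`, with `x` bound to column 0.
fn x_plus_one() -> Expr {
    Expr::Add(Box::new(Expr::Column(0)), Box::new(Expr::Literal(1)))
}
```

Calling `x_plus_one().evaluate(&[vec![1, 2, 3]])` binds `x` to the single column and yields `[2, 3, 4]`. The point is that the lambda parameter is just a column in a synthetic batch, so the body needs no special evaluation path.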
and call it from the calling built-in function:
```
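use arrow::array::{Array, ListArray};
use arrow::datatypes::Field;

fn array_transform(list_array: &ListArray, lambda: &LambdaExpr) -> Result<ListArray> {
    let values = list_array.values().clone();
    // Evaluate the lambda once on the flattened child array
    let transformed = lambda.evaluate_with_args(vec![values])?;
    // The element type comes from the lambda's output (array<U>), not the input
    let field = Arc::new(Field::new("item", transformed.data_type().clone(), true));
    Ok(ListArray::new(
        field,
        list_array.offsets().clone(),
        transformed,
        list_array.nulls().cloned(),
    ))
}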
```
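The reason this works on the flattened child array is that the lambda is element-wise, so the list offsets stay valid. A minimal std-only sketch of that idea follows; `ToyList`, `transform_flat`, and `row` are hypothetical names, and null handling is left out on purpose.

```rust
/// Std-only stand-in for a list array (assumption for illustration):
/// row `i` is `values[offsets[i]..offsets[i + 1]]`.
#[derive(Debug, PartialEq)]
struct ToyList<T> {
    offsets: Vec<usize>,
    values: Vec<T>,
}

/// Apply the lambda once over the flattened child values and reuse the
/// offsets unchanged, since an element-wise lambda keeps the value count.
fn transform_flat<T, U>(list: &ToyList<T>, f: impl Fn(&T) -> U) -> ToyList<U> {
    ToyList {
        offsets: list.offsets.clone(),
        values: list.values.iter().map(f).collect(),
    }
}

/// Materialize row `i` for inspection.
fn row<T: Clone>(list: &ToyList<T>, i: usize) -> Vec<T> {
    list.values[list.offsets[i]..list.offsets[i + 1]].to_vec()
}
```

For example, with offsets `[0, 2, 2, 5]` and values `[1, 2, 3, 4, 5]`, transforming with `|x| x + 1` gives rows `[2, 3]`, `[]`, and `[4, 5, 6]` with a single pass over the values.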