jorgecarleitao opened a new pull request #8144:
URL: https://github.com/apache/arrow/pull/8144
This functionality is relevant for the DataFrame API only.
Sometimes a UDF is declared during planning, and it is more expressive to let
the user call it directly, without first registering it in the execution
context's registry and then looking it up in the registry to plan it.
This PR proposes that, given a UDF `pow: ScalarUDF` named `"pow"`, users can
(logically) plan it directly:
```rust
// plan a call
let expr = pow.call(vec![col("a"), col("b")]);
```
or register it (as before):
```rust
// register it
ctx.register_udf(pow);
```
or plan it from the registry (as before):
```rust
// access it from the registry
let pow = df.registry().udf("pow")?;
// plan a call
let expr = pow.call(vec![col("a"), col("b")]);
```
I changed the signature of the registry lookup from `.udf(name, args) -> Expr`
to `.udf(name) -> ScalarUDF`, so that the API to call a UDF is the same,
`call(args)`, regardless of whether we take it from the registry or use it
directly. IMO it also makes it a bit more expressive that we are calling the `udf`.
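To illustrate the shape of the change, here is a minimal, self-contained sketch. It is not the DataFusion code: `Expr`, `ScalarUDF`, `Registry`, and `plan_both_ways` below are simplified, hypothetical stand-ins (the real `ScalarUDF` also carries a signature, a return type, and the function itself). The point is only that a registry lookup returning the UDF lets both paths go through the same `call(args)`:

```rust
use std::collections::HashMap;

/// Simplified stand-in for a logical expression (hypothetical, not the real `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    ScalarUDF { name: String, args: Vec<Expr> },
}

/// Builds a column reference, mirroring the `col("a")` helper used above.
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Simplified stand-in for a UDF: only the name is modelled here.
#[derive(Debug, Clone, PartialEq)]
pub struct ScalarUDF {
    pub name: String,
}

impl ScalarUDF {
    pub fn new(name: &str) -> Self {
        ScalarUDF { name: name.to_string() }
    }

    /// Logically plans a call to this UDF with the given arguments.
    pub fn call(&self, args: Vec<Expr>) -> Expr {
        Expr::ScalarUDF { name: self.name.clone(), args }
    }
}

/// Hypothetical registry keyed by UDF name.
#[derive(Default)]
pub struct Registry {
    udfs: HashMap<String, ScalarUDF>,
}

impl Registry {
    pub fn register_udf(&mut self, f: ScalarUDF) {
        self.udfs.insert(f.name.clone(), f);
    }

    /// New-style lookup: returns the UDF itself rather than a planned `Expr`.
    pub fn udf(&self, name: &str) -> Result<&ScalarUDF, String> {
        self.udfs
            .get(name)
            .ok_or_else(|| format!("no UDF named \"{}\" in the registry", name))
    }
}

/// Plans the same call directly and via the registry, returning both expressions.
pub fn plan_both_ways() -> Result<(Expr, Expr), String> {
    let pow = ScalarUDF::new("pow");

    // Path 1: plan directly from the UDF value.
    let direct = pow.call(vec![col("a"), col("b")]);

    // Path 2: register, look up by name, then plan with the same `call`.
    let mut registry = Registry::default();
    registry.register_udf(pow);
    let from_registry = registry.udf("pow")?.call(vec![col("a"), col("b")]);

    Ok((direct, from_registry))
}
```

In this sketch both paths build the same expression, and a lookup of an unregistered name surfaces as an error at the `?`, before any call is planned.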