jorgecarleitao opened a new pull request #8144:
URL: https://github.com/apache/arrow/pull/8144


   This functionality is relevant for the DataFrame API only.
   
   Sometimes a UDF is declared during planning, and it is more convenient 
when the user can use it directly, without first registering it in the execution 
context's registry and then accessing the registry to plan it.
   
   This PR proposes that, given a UDF `pow: ScalarUDF` named `"pow"`, users can 
(logically) plan it directly:
   
   ```rust
   // plan a call
   let expr = pow.call(vec![col("a"), col("b")]);
   ```
   
   or register it (as before):
   
   ```rust
   // register it
   ctx.register_udf(pow);
   ```
   
   or plan it from the registry (as before):
   
   ```rust
   // access it from the registry
   let pow = df.registry().udf("pow")?;
   
   // plan a call
   let expr = pow.call(vec![col("a"), col("b")]);
   ```
   
   I changed the signature of the registry's lookup from `.udf(name, args) -> Expr` to 
`.udf(name) -> ScalarUDF`, so that the API to call a UDF is the same, `call(args)`, 
regardless of whether we take the UDF from the registry or use it directly. IMO it 
also makes it more explicit that we are calling the `udf`.
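
To illustrate the intent of the signature change, here is a minimal, self-contained sketch. The `Expr`, `ScalarUDF`, `Registry` and `col` definitions below are simplified stand-ins written for this example, not DataFusion's actual types, and `plan_both_ways` is a hypothetical helper. The point is only that, once the lookup returns the UDF itself, both paths go through the same `call(args)`:

```rust
use std::collections::HashMap;

// Simplified stand-in for a logical expression (assumption, not the real `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    ScalarUDF { name: String, args: Vec<Expr> },
}

// Stand-in for the `col` helper used in the snippets above.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Simplified stand-in for `ScalarUDF`: only the name is modeled here.
#[derive(Debug, Clone)]
struct ScalarUDF {
    name: String,
}

impl ScalarUDF {
    // Logically plan a call to this UDF with the given arguments.
    fn call(&self, args: Vec<Expr>) -> Expr {
        Expr::ScalarUDF { name: self.name.clone(), args }
    }
}

// Stand-in for the execution context's registry of UDFs.
#[derive(Default)]
struct Registry {
    udfs: HashMap<String, ScalarUDF>,
}

impl Registry {
    fn register_udf(&mut self, f: ScalarUDF) {
        self.udfs.insert(f.name.clone(), f);
    }

    // The lookup returns the UDF itself instead of an already-built `Expr`.
    fn udf(&self, name: &str) -> Result<&ScalarUDF, String> {
        self.udfs
            .get(name)
            .ok_or_else(|| format!("no UDF named \"{}\" in the registry", name))
    }
}

// Hypothetical helper: plan the same call directly and via the registry.
fn plan_both_ways() -> Result<(Expr, Expr), String> {
    let pow = ScalarUDF { name: "pow".to_string() };

    // Plan a call directly, without registering first.
    let direct = pow.call(vec![col("a"), col("b")]);

    // Register the UDF, look it up by name, and plan through the same `call`.
    let mut registry = Registry::default();
    registry.register_udf(pow);
    let from_registry = registry.udf("pow")?.call(vec![col("a"), col("b")]);

    Ok((direct, from_registry))
}
```

In this sketch both paths produce the same logical expression, which is the uniformity the new `.udf(name)` signature is aiming for.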



