[
https://issues.apache.org/jira/browse/ARROW-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jorge updated ARROW-9836:
-------------------------
Description:
TL;DR: currently, users call UDFs through
df.select(scalar_functions("sqrt", vec![col("a")], DataType::Float64))
Proposal:
let f = df.registry();
df.select(f.udf("sqrt", vec![col("a")])?)
so that they do not have to remember the UDF's return type when calling it.
In the future, this API will also allow a UDF to be declared as part of planning,
as in Spark, instead of having to be registered in the registry before it is used
(we just need to check whether the UDF is already registered before doing so).
See the complete proposal here:
[https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing]
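To illustrate why a registry lookup removes the need to pass the return type, here is a minimal, self-contained sketch. It does not use DataFusion's real types: `Registry`, `register`, `Expr`, and `DataType` below are hypothetical stand-ins written for this example, and only the two call shapes (`scalar_functions(...)` and `f.udf(...)`) are taken from the description.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for DataFusion's types, reduced to what the example needs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    ScalarFunction {
        name: String,
        args: Vec<Expr>,
        return_type: DataType,
    },
}

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Current style: the caller must know and pass the return type.
fn scalar_functions(name: &str, args: Vec<Expr>, return_type: DataType) -> Expr {
    Expr::ScalarFunction {
        name: name.to_string(),
        args,
        return_type,
    }
}

// Assumed registry: remembers each UDF's return type at registration time.
#[derive(Default)]
struct Registry {
    return_types: HashMap<String, DataType>,
}

impl Registry {
    fn register(&mut self, name: &str, return_type: DataType) {
        self.return_types.insert(name.to_string(), return_type);
    }

    // Proposed style: the return type is looked up, so an unknown name is an error.
    fn udf(&self, name: &str, args: Vec<Expr>) -> Result<Expr, String> {
        let return_type = self
            .return_types
            .get(name)
            .cloned()
            .ok_or_else(|| format!("UDF '{}' is not registered", name))?;
        Ok(scalar_functions(name, args, return_type))
    }
}

fn main() -> Result<(), String> {
    let mut f = Registry::default();
    f.register("sqrt", DataType::Float64);

    // Both styles build the same expression; only the proposed one can fail early.
    let proposed = f.udf("sqrt", vec![col("a")])?;
    let current = scalar_functions("sqrt", vec![col("a")], DataType::Float64);
    assert_eq!(proposed, current);
    assert!(f.udf("cbrt", vec![col("a")]).is_err());
    Ok(())
}
```

In this sketch the `?` on `f.udf(...)` plays the same role as in the proposal: a misspelled or unregistered function name surfaces as an error when the expression is built, rather than as a wrong return type later on.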
was:
TL;DR: currently, users call UDFs through
df.select(scalar_functions("sqrt", vec![col("a")], DataType::Float64))
Proposal:
let udf = df.registry()?;
df.select(udf("sqrt", vec![col("a")])?)
so that they do not have to remember the UDF's return type when calling it.
In the future, this API will also allow a UDF to be declared as part of planning,
as in Spark, instead of having to be registered in the registry before it is used
(we just need to check whether the UDF is already registered before doing so).
See the complete proposal here:
[https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing]
> [Rust] [DataFusion] Improve API for usage of UDFs
> -------------------------------------------------
>
> Key: ARROW-9836
> URL: https://issues.apache.org/jira/browse/ARROW-9836
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust, Rust - DataFusion
> Reporter: Jorge
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>