[ 
https://issues.apache.org/jira/browse/ARROW-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle McCarthy updated ARROW-6947:
---------------------------------
    Comment: was deleted

(was: I am curious to see if you have any ideas about how this would work. I 
have been working on a PoC, but will probably need to make some design 
decisions and would like to see if they align with yours.

At a high level, I see this working by composing a UDF with some general 
ScalarFunction type. Right now I have the ScalarFunction with type: 
{code:java}
Box<dyn Fn(Vec<ScalarValue>) -> Result<ScalarValue>>{code}
so if a user defines a function such as
{code:java}
fn length(s: String) -> usize{code}
we would wrap that and return our ScalarFunction.
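
As a minimal sketch of that wrapping (not the PoC itself), assume a simplified
stand-in ScalarValue enum with only two variants, a String error type, and a
hypothetical helper named wrap_string_to_usize. None of these definitions come
from DataFusion:

```rust
// Simplified stand-in for DataFusion's ScalarValue (assumption: only the
// two variants this example needs).
#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Utf8(String),
    UInt64(u64),
}

// Hypothetical error type for the sketch; the real crate defines its own.
type Result<T> = std::result::Result<T, String>;

// The general function type described above.
type ScalarFunction = Box<dyn Fn(Vec<ScalarValue>) -> Result<ScalarValue>>;

// The user's plain Rust function.
fn length(s: String) -> usize {
    s.len()
}

// Hypothetical helper: wraps a `fn(String) -> usize` so that it unpacks its
// single argument from a ScalarValue and packs the result back into one.
fn wrap_string_to_usize(f: fn(String) -> usize) -> ScalarFunction {
    Box::new(move |args: Vec<ScalarValue>| {
        if args.len() != 1 {
            return Err(format!("expected 1 argument, got {}", args.len()));
        }
        match args.into_iter().next() {
            Some(ScalarValue::Utf8(s)) => Ok(ScalarValue::UInt64(f(s) as u64)),
            other => Err(format!("expected a Utf8 argument, got {:?}", other)),
        }
    })
}
```

Calling wrap_string_to_usize(length) on a single Utf8 value yields a UInt64
holding the string's byte length; a wrong argument count or type returns an
Err instead of panicking.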

I think that the composed functions need to be associated with some "static" 
metadata, similar to the FunctionMeta in the logical plan. I think we would 
want to know the DataType of each argument that the function expects and 
whether it is optional, as well as the return type and whether the function 
is fallible or infallible.

If the UDF accepts and returns primitive Rust types, generating that metadata 
should be pretty straightforward. However, if the UDF takes/returns 
ScalarValues, the user would have to provide the metadata explicitly.
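
One way the metadata could be derived from primitive Rust signatures is a pair
of traits that map argument and return types to a DataType. The sketch below is
hypothetical throughout: DataType is a simplified stand-in for Arrow's enum,
and ArgMeta, ScalarFunctionMeta, ArgType, ReturnType, and meta_for_unary are
illustrative names, not existing DataFusion APIs. It assumes Option<T>
arguments mean "optional" and Result<T, E> returns mean "fallible":

```rust
// Simplified stand-in for Arrow's DataType (assumption: three variants only).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int64,
    UInt64,
}

// Hypothetical per-argument metadata.
#[derive(Debug, Clone, PartialEq)]
struct ArgMeta {
    data_type: DataType,
    optional: bool,
}

// Hypothetical "static" metadata attached to a composed function.
#[derive(Debug, Clone, PartialEq)]
struct ScalarFunctionMeta {
    args: Vec<ArgMeta>,
    return_type: DataType,
    fallible: bool,
}

// Maps a Rust argument type to its metadata.
trait ArgType {
    fn arg_meta() -> ArgMeta;
}

impl ArgType for String {
    fn arg_meta() -> ArgMeta {
        ArgMeta { data_type: DataType::Utf8, optional: false }
    }
}

impl ArgType for i64 {
    fn arg_meta() -> ArgMeta {
        ArgMeta { data_type: DataType::Int64, optional: false }
    }
}

// An Option<T> argument has T's data type but is marked optional.
impl<T: ArgType> ArgType for Option<T> {
    fn arg_meta() -> ArgMeta {
        ArgMeta { optional: true, ..T::arg_meta() }
    }
}

// Maps a Rust return type to (data type, fallible).
trait ReturnType {
    fn return_meta() -> (DataType, bool);
}

impl ReturnType for usize {
    fn return_meta() -> (DataType, bool) {
        (DataType::UInt64, false)
    }
}

impl ReturnType for i64 {
    fn return_meta() -> (DataType, bool) {
        (DataType::Int64, false)
    }
}

// A Result<T, E> return has T's data type but is marked fallible.
impl<T: ReturnType, E> ReturnType for Result<T, E> {
    fn return_meta() -> (DataType, bool) {
        (T::return_meta().0, true)
    }
}

// Derives the metadata for a one-argument function from its signature alone;
// the function itself is never called.
fn meta_for_unary<A: ArgType, R: ReturnType>(_f: fn(A) -> R) -> ScalarFunctionMeta {
    let (return_type, fallible) = R::return_meta();
    ScalarFunctionMeta { args: vec![A::arg_meta()], return_type, fallible }
}

// The example UDF from above.
fn length(s: String) -> usize {
    s.len()
}
```

With these definitions, meta_for_unary(length) reports one required Utf8
argument and an infallible UInt64 return. A UDF that works directly on
ScalarValues has no such signature to inspect, which is why its metadata would
have to be supplied by hand.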

We would be able to generate most of the data for the logical plan's 
FunctionMeta but would still need the function name and the field names for the 
args.

As of right now, I haven't done anything related to Aggregate UDFs or actually 
registering them with the ExecutionContext. )

> [Rust] [DataFusion] Add support for scalar UDFs
> -----------------------------------------------
>
>                 Key: ARROW-6947
>                 URL: https://issues.apache.org/jira/browse/ARROW-6947
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> As a user, I would like to be able to define my own functions and then use 
> them in SQL statements.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
