[ 
https://issues.apache.org/jira/browse/ARROW-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033194#comment-17033194
 ] 

Jorge commented on ARROW-6947:
------------------------------

I finished a first draft of this that I am minimally happy with; see 
[https://github.com/jorgecarleitao/arrow/commit/0c77ff531cee191712d08633080de9caed786b62].
Some function signatures are deliberately left unimplemented, because I think 
we need to make a design decision about them first.

The API is used as shown in [this 
test|https://github.com/jorgecarleitao/arrow/commit/0c77ff531cee191712d08633080de9caed786b62#diff-8273e76b6910baa123f3a25a967af3b5R572]:
{code:java}
// declare a function
let f = FunctionSignature::UInt64UInt64(Arc::new(Box::new(|a: &u64| Ok(a * a))));

// register function
ctx.register_function("pow2", f);

let results = collect(&mut ctx, "SELECT c2, pow2(c2) FROM test")?;
{code}
The important aspect here is that `FunctionSignature::UInt64UInt64` is a 
variant of the `FunctionSignature` enum (this is the design choice that we 
need to discuss).

I took this design decision based on the following hypotheses:
 # there are limited use-cases for UDFs with many arguments
 # there are a limited number of different types in arrow (25ish)

Under these hypotheses, we can enumerate all variations in the relevant 
`match`. I am not entirely convinced that this is the correct approach, but it 
is the one most aligned with how the code base currently enumerates all types 
in `match` expressions, and it does not resort to dynamic typing.
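
As a rough illustration of the enum-based approach (a minimal sketch, not the code in the linked commit), the dispatch could look like the following. `Column` is a hypothetical stand-in for an Arrow array, `apply` is a hypothetical helper, and the `Float64Float64` variant and the `String` error type are assumptions made only to keep the example self-contained.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: one variant per supported type.
#[derive(Debug, PartialEq)]
pub enum Column {
    UInt64(Vec<u64>),
    Float64(Vec<f64>),
}

/// One variant per (argument type, return type) combination.
/// Only two combinations are listed here; the real list would be much longer.
pub enum FunctionSignature {
    UInt64UInt64(Arc<dyn Fn(&u64) -> Result<u64, String> + Send + Sync>),
    Float64Float64(Arc<dyn Fn(&f64) -> Result<f64, String> + Send + Sync>),
}

/// Applies a UDF element-wise. Every supported combination needs its own
/// `match` arm, which is where the enumeration cost shows up.
pub fn apply(f: &FunctionSignature, input: &Column) -> Result<Column, String> {
    match (f, input) {
        (FunctionSignature::UInt64UInt64(func), Column::UInt64(values)) => {
            let out = values
                .iter()
                .map(|v| func(v))
                .collect::<Result<Vec<u64>, String>>()?;
            Ok(Column::UInt64(out))
        }
        (FunctionSignature::Float64Float64(func), Column::Float64(values)) => {
            let out = values
                .iter()
                .map(|v| func(v))
                .collect::<Result<Vec<f64>, String>>()?;
            Ok(Column::Float64(out))
        }
        // Any other pairing is a type mismatch between the UDF and the column.
        _ => Err("function signature does not match column type".to_string()),
    }
}
```

The type check stays static inside each arm, but the number of variants and arms grows with every additional type and argument count.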

An alternative design is to use `std::any::Any`. I can also play around with 
this and see how it feels. Assuming that we can do it, I suspect that the 
trade-off is between run-time cost and compilation time plus binary size.
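
For comparison, a minimal sketch of what the `std::any::Any` alternative might look like. The names `DynFunction`, `erase` and `call_u64` are hypothetical, as is the `String` error type; this only illustrates where the type check moves to, and it operates on single values rather than Arrow arrays.

```rust
use std::any::Any;

/// Hypothetical type-erased UDF: argument and result travel as `dyn Any`.
pub type DynFunction = Box<dyn Fn(&dyn Any) -> Result<Box<dyn Any>, String>>;

/// Wraps a typed closure so that the argument type is checked on every call
/// at run time instead of being encoded in an enum variant.
pub fn erase<A, R, F>(f: F) -> DynFunction
where
    A: Any,
    R: Any,
    F: Fn(&A) -> R + 'static,
{
    Box::new(move |arg: &dyn Any| {
        // Run-time type check: fails if the caller passed the wrong type.
        let typed = arg
            .downcast_ref::<A>()
            .ok_or_else(|| "argument has an unexpected type".to_string())?;
        Ok(Box::new(f(typed)) as Box<dyn Any>)
    })
}

/// Calls a type-erased UDF with a `u64` and downcasts the result back.
pub fn call_u64(f: &DynFunction, value: u64) -> Result<u64, String> {
    let arg: &dyn Any = &value;
    let out = f(arg)?;
    out.downcast::<u64>()
        .map(|boxed| *boxed)
        .map_err(|_| "result has an unexpected type".to_string())
}
```

A single type covers every signature here, so nothing needs to be enumerated, but a mismatch is only detected when the function is called.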

> [Rust] [DataFusion] Add support for scalar UDFs
> -----------------------------------------------
>
>                 Key: ARROW-6947
>                 URL: https://issues.apache.org/jira/browse/ARROW-6947
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> As a user, I would like to be able to define my own functions and then use 
> them in SQL statements.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
