[ https://issues.apache.org/jira/browse/ARROW-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033194#comment-17033194 ]
Jorge commented on ARROW-6947:
------------------------------
I finished a first draft of this that I am minimally happy with; see
[https://github.com/jorgecarleitao/arrow/commit/0c77ff531cee191712d08633080de9caed786b62].
Not all function signatures are implemented, and this is on purpose: I think we
need to make a design decision about them first.
The API is used as shown in [this
test|https://github.com/jorgecarleitao/arrow/commit/0c77ff531cee191712d08633080de9caed786b62#diff-8273e76b6910baa123f3a25a967af3b5R572]:
{code:java}
// declare a function
let f = FunctionSignature::UInt64UInt64(Arc::new(Box::new(|a: &u64| Ok(a * a))));
// register function
ctx.register_function("pow2", f);
let results = collect(&mut ctx, "SELECT c2, pow2(c2) FROM test")?;
{code}
The important aspect here is that `FunctionSignature::UInt64UInt64` is a
variant of the enum `FunctionSignature` (this is the design choice that we need
to discuss).
I took this design decision based on the following hypotheses:
# there are limited use-cases for UDFs with many arguments
# there are a limited number of different types in arrow (25ish)
Under these hypotheses, we can enumerate all variations in the relevant
`match`. I am not entirely convinced that this is the correct approach, but it
is the one most aligned with how the code base currently enumerates all types
in `match` expressions, and it does not resort to dynamic typing.
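To make the enumeration idea concrete, here is a minimal sketch of how a per-signature enum and its `match` could look. Everything except the name `FunctionSignature::UInt64UInt64` is an assumption made for illustration: `Column`, `UdfResult`, `apply` and the `Float64Float64` variant are hypothetical stand-ins and are not taken from the linked commit, which works on Arrow arrays rather than on `Vec`.

```rust
use std::sync::Arc;

// Hypothetical error/result type, used only in this sketch.
type UdfResult<T> = Result<T, String>;

/// One variant per (argument type, return type) combination.
/// Only two combinations are listed; the real enum would need many more.
pub enum FunctionSignature {
    UInt64UInt64(Arc<dyn Fn(&u64) -> UdfResult<u64> + Send + Sync>),
    Float64Float64(Arc<dyn Fn(&f64) -> UdfResult<f64> + Send + Sync>),
}

/// Simplified stand-in for a typed column (assumption: not an Arrow array).
#[derive(Debug, PartialEq)]
pub enum Column {
    UInt64(Vec<u64>),
    Float64(Vec<f64>),
}

/// Apply a UDF element-wise, dispatching on the (signature, column) pair.
/// Every supported combination needs its own arm, which is the cost of
/// this design in compile time and binary size.
pub fn apply(f: &FunctionSignature, input: &Column) -> UdfResult<Column> {
    match (f, input) {
        (FunctionSignature::UInt64UInt64(func), Column::UInt64(values)) => {
            let out: UdfResult<Vec<u64>> = values.iter().map(|v| func(v)).collect();
            Ok(Column::UInt64(out?))
        }
        (FunctionSignature::Float64Float64(func), Column::Float64(values)) => {
            let out: UdfResult<Vec<f64>> = values.iter().map(|v| func(v)).collect();
            Ok(Column::Float64(out?))
        }
        // Any other pairing is a type mismatch detected at run time.
        _ => Err("argument type does not match the function signature".to_string()),
    }
}
```

With this shape, the types of the closure are checked by the compiler when the variant is constructed, and only the pairing of a function with a column is checked when the query runs.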
An alternative design is to use `std::any::Any`. I can also play around with
this and see how it feels. Assuming that we can do it, I suspect that the
trade-off is between run-time cost (for the dynamic approach) and compilation
time plus binary size (for the enum approach).
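For comparison, a rough sketch of what the `std::any::Any` alternative might look like. This is not something I have implemented; the names `DynFunction`, `erase` and `call_as` are hypothetical and exist only for this illustration. The idea is to store a single type-erased closure and move the type check to run time via `downcast_ref`.

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical error/result type, used only in this sketch.
type UdfResult<T> = Result<T, String>;

/// A type-erased scalar UDF: one type covers every signature.
pub type DynFunction = Arc<dyn Fn(&dyn Any) -> UdfResult<Box<dyn Any>> + Send + Sync>;

/// Wrap a typed closure so that its argument type is checked at run time.
pub fn erase<A, R, F>(f: F) -> DynFunction
where
    A: Any,
    R: Any,
    F: Fn(&A) -> UdfResult<R> + Send + Sync + 'static,
{
    Arc::new(move |arg: &dyn Any| -> UdfResult<Box<dyn Any>> {
        // The downcast fails if the caller passes a value of the wrong type.
        let typed = arg
            .downcast_ref::<A>()
            .ok_or_else(|| "argument has an unexpected type".to_string())?;
        let out = f(typed)?;
        Ok(Box::new(out) as Box<dyn Any>)
    })
}

/// Call a type-erased UDF and downcast its result to the expected type.
pub fn call_as<R: Any>(f: &DynFunction, arg: &dyn Any) -> UdfResult<R> {
    let out = f(arg)?;
    out.downcast::<R>()
        .map(|boxed| *boxed)
        .map_err(|_| "result has an unexpected type".to_string())
}
```

Here there is no per-signature variant to enumerate, but every call pays for a downcast and a boxed result, and type mismatches only surface when the function is invoked.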
> [Rust] [DataFusion] Add support for scalar UDFs
> -----------------------------------------------
>
> Key: ARROW-6947
> URL: https://issues.apache.org/jira/browse/ARROW-6947
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust, Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
>
> As a user, I would like to be able to define my own functions and then use
> them in SQL statements.
>