Hello Mura,

You can also take a look at the compute registration example [1] and the
UDF example [2] in the Arrow source code to get an idea of how to write a
custom function and use it in C++. Assuming your objective is to add a new
function to C++ permanently and use it in Python, you may also need to take
a look at the existing Python bindings.

For instance, take a look at how the Join operator is exposed [3]-[8]. This
is not a full list of references to all the components, but I hope it gives
you an idea of how to expose a function from C++ to Python.

In addition, if you just need a function to be used in Python (a
user-defined function), you can do everything in Python as well.
For that, you may take a look at [9], [10]. Please note that this feature is
experimental and a work in progress.

[1] Compute Example:
https://github.com/apache/arrow/blob/master/cpp/examples/arrow/compute_register_example.cc
[2] UDF Example:
https://github.com/apache/arrow/blob/master/cpp/examples/arrow/udf_example.cc
[3] Expose JoinOptions(a):
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/includes/libarrow.pxd#L1999
[4] Expose JoinOptions(b):
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/_compute.pyx#L998
[5] Expose JoinType:
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/includes/libarrow.pxd#L2473
[6] Expose HashJoinNodeOptions:
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/includes/libarrow.pxd#L2504
[7] Expose the Join operator in Cython:
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/_exec_plan.pyx#L167
[8] Join Usage in Table:
https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.join
[9] UDF tests:
https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_udf.py
[10] UDF Registration:
https://github.com/apache/arrow/blob/86915807af6fe10f44bc881e57b2f425f97c56c7/python/pyarrow/_compute.pyx#L2504

Best,
Vibhatha

On Fri, Jun 24, 2022 at 3:29 AM Aldrin <akmon...@ucsc.edu.invalid> wrote:

> Hi Mura,
>
> The short answer is: yes, it is possible. Naively, to expose such a
> function, something may need to be added in Cython. I am learning more about
> how to add new compute functions, so I am not sure yet if functions in the
> function registry are automatically exposed to other languages (e.g. R,
> Python, etc.). When I find out more about how this is done, I can follow up
> if this is still unanswered.
>
> There are examples of functions that you can call directly without using
> "CallFunction"; for example, arrow::compute::Add. The second code sample on
> [1] should show how this is done.
>
> [1]: https://arrow.apache.org/docs/cpp/compute.html#invoking-functions
>
> Aldrin Montana
> Computer Science PhD Student
> UC Santa Cruz
>
>
> On Wed, Jun 22, 2022 at 12:34 PM Murali S <muralibala8...@gmail.com>
> wrote:
>
> > Hi ,
> >
> > I was wondering if it is possible to add a C++ Function to the Compute
> > FunctionRegistry
> > <https://arrow.apache.org/docs/cpp/api/compute.html#_CPPv4N5arrow7compute16FunctionRegistryE>
> > and then use the functions in python. Would be great if you could provide
> > examples of such usage.
> > Also are all functions added to the FunctionRegistry only callable using
> > the GetFunction API with the function name as string ? Would like to know
> > if there is a way to just do arrow::compute::FuncA where FuncA is the
> > newly added function
> >
> > Thanks in advance
> > Mura
> >
>
