On Fri, Jun 28, 2024 at 11:07 AM Andrew Lamb <al...@influxdata.com> wrote:
>
> Hi Xuanwo,
>
> Sorry for the delay in responding. I think the ability to easily write
> functions that "feel" like native functions in whatever language and be
> able to generate arrow / vectorized versions of them is quite valuable.
> This is my understanding of what this proposal is about.

My understanding is that it's not vectorized. From the examples in
risingwavelabs/arrow-udf <https://github.com/risingwavelabs/arrow-udf>, it
looks like the macros generate code that gathers values from columns into
local scalars, which are then passed as scalar parameters to user functions.
Is the hope here that rustc/LLVM will auto-vectorize the code? For example:

#[function("gcd(int, int) -> int")]
fn gcd(mut a: i32, mut b: i32) -> i32 {
    while b != 0 {
        (a, b) = (b, a % b);
    }
    a
}

#[function("div(int, int) -> int")]
fn div(x: i32, y: i32) -> Result<i32, &'static str> {
    if y == 0 {
        return Err("division by zero");
    }
    Ok(x / y)
}
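To make the concern concrete, here is a minimal sketch of the row-at-a-time shape I am describing. It is an assumption for illustration, not the actual expansion of the `#[function]` macro: `gcd_rowwise` is a hypothetical name, and plain slices of `Option<i32>` stand in for Arrow Int32 arrays (`None` meaning null).

```rust
// Hypothetical sketch only: this is NOT the code that #[function] generates.
// Slices of Option<i32> stand in for Arrow Int32 columns (None = null).

/// The scalar user function, as in the example above.
fn gcd(mut a: i32, mut b: i32) -> i32 {
    while b != 0 {
        (a, b) = (b, a % b);
    }
    a
}

/// Row-at-a-time wrapper: gather one scalar from each input column, call the
/// scalar function, and collect the results into an output column.
/// A null in either input produces a null in the output.
fn gcd_rowwise(a: &[Option<i32>], b: &[Option<i32>]) -> Vec<Option<i32>> {
    assert_eq!(a.len(), b.len(), "input columns must have equal length");
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| match (x, y) {
            (Some(x), Some(y)) => Some(gcd(*x, *y)),
            _ => None,
        })
        .collect()
}
```

Calling `gcd_rowwise(&[Some(12), None, Some(7)], &[Some(18), Some(4), Some(0)])` yields `[Some(6), None, Some(7)]`. With a wrapper of this shape, any SIMD speedup would depend entirely on the compiler vectorizing the per-row body, which seems unlikely for a data-dependent `while` loop such as the one in `gcd`.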

> I left some additional comments on the markdown.
>
> One thing that might be worth doing is articulate some other potential
> locations for where the code might go. One option, as I think you propose,
> is to make its own repository.  Another option could be to donate the code
> and put the various language bindings in the same repo as the arrow
> language implementations (e.g. arrow-rs, arrow for python, etc) which would
> likely make it easier to maintain and discover.
>
> I am curious about what other devs / users feel about this?
>
> Andrew
>
>
>
> On Thu, Jun 20, 2024 at 3:04 AM Xuanwo <xua...@apache.org> wrote:
>
> > Hello, everyone.
> >
> > I am starting this thread to discuss the donation of a User-Defined Function
> > Framework for Apache Arrow.
> >
> > Feel free to review and leave your comments here. For live review, please
> > visit:
> >
> > https://hackmd.io/@xuanwo/apache-arrow-udf
> >
> > The original content is also pasted here for quick reading:
> >
> > ------
> >
> > ## Abstract
> >
> > Arrow UDF is a User-Defined Function Framework for Apache Arrow.
> >
> > ## Proposal
> >
> > Arrow UDF allows users to easily create and run user-defined functions
> > (UDF) in Rust, Python, Java or JavaScript based on Apache Arrow. The
> > functions can be executed natively, or in WebAssembly, or in a remote
> > server via Arrow Flight.
> >
> > Arrow UDF was originally designed to be used by the RisingWave project but
> > is now being used by Databend and several database startups.
> >
> > We believe that the Arrow UDF project will bring diversity and value to the
> > entire Arrow community.
> >
> > ## Background
> >
> > Arrow UDF has been developed by an open-source community from day one and
> > is owned by RisingWaveLabs. The project was launched in December 2023.
> >
> > ## Initial Goals
> >
> > By transferring ownership of the project to Apache Arrow, Arrow UDF
> > expects to ensure its neutrality and further encourage and facilitate the
> > adoption of Arrow UDF by the community.
> >
> > ## Current Status
> >
> > Contributors: 5
> >
> > Users:
> >
> > -   [RisingWave]: A Distributed SQL Database for Stream Processing.
> > -   [Databend]: An open-source cloud data warehouse that serves as a
> > cost-effective alternative to Snowflake.
> >
> > ## Documentation
> >
> > The documentation for Arrow UDF is hosted at
> > https://docs.rs/arrow-udf/latest/arrow_udf/.
> >
> > ## Initial Source
> >
> > The project currently holds a GitHub repository and multiple packages:
> >
> > - https://github.com/risingwavelabs/arrow-udf
> >
> > Rust:
> >
> > - https://crates.io/arrow-udf/
> > - https://crates.io/arrow-udf-python/
> > - https://crates.io/arrow-udf-js/
> > - https://crates.io/arrow-udf-js-deno/
> > - https://crates.io/arrow-udf-wasm/
> >
> > Python:
> >
> > - https://pypi.org/project/arrow-udf/
> >
> > Those packages will retain their names, while the repository will be moved
> > to the apache org.
> >
> > ## Required Resources
> >
> > ### Mailing Lists
> >
> > We can reuse the existing mailing lists that Arrow has.
> >
> > ### Git Repositories
> >
> > From
> >
> > - https://github.com/risingwavelabs/arrow-udf
> >
> > To
> >
> > - https://gitbox.apache.org/asf/repos/arrow-udf
> > - https://github.com/apache/arrow-udf
> >
> > ### Issue Tracking
> >
> > The project would like to continue using GitHub Issues.
> >
> > ### Other Resources
> >
> > The project has already chosen GitHub Actions as its continuous integration
> > tool.
> >
> > ## Initial Committers
> >
> > - Runji Wang wangrunji0...@163.com
> > - Giovanny Gutiérrez
> > - sundy-li sund...@apache.org
> > - Xuanwo xua...@apache.org
> > - Max Justus Spransy maxjus...@gmail.com
> >
> > [RisingWave]: https://github.com/risingwavelabs/risingwave
> > [Databend]: https://github.com/datafuselabs/databend
> >
> > Xuanwo
> >
