One objective of the precompiled kernels project is to have meaningful
computational functionality in a package that does not need to include
the LLVM runtime. Requiring the LLVM dependency even for simple
functions would more than double the size of our Python packages, for
example.

There is currently little code sharing between functions that do
identical work in arrow::compute:: versus gandiva:: -- this has been
discussed, but it needs a champion to do something about it. When I
was working on the new function framework earlier this year, I spent a
day or so perusing src/gandiva/precompiled and reasoned it would be a
prohibitive amount of refactoring for me to undertake at that time. In
principle many of these functions (e.g. string functions) can be
incrementally refactored into reusable inline functions / templates
for improved code reuse. We could also explore common infrastructure
for unit testing and benchmarking. Anything is possible if enough
engineering time is invested.
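
To make the "reusable inline functions / templates" idea concrete, here
is a rough sketch. It is not actual Arrow or Gandiva code: the `shared`
namespace and the names `Utf8Length`, `sketch_utf8_length` and
`SketchUtf8LengthBatch` are all hypothetical, and the helper assumes
valid UTF-8 input. The point is only that the per-value logic lives in
one place, with a thin scalar entry point of the kind a precompiled
module might expose and a batch loop of the kind an array kernel might
run.

```cpp
#include <cstdint>

namespace shared {

// Hypothetical shared helper: counts UTF-8 code points in a byte range by
// skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8; no validation is performed.
constexpr int32_t Utf8Length(const char* data, int32_t len) {
  int32_t count = 0;
  for (int32_t i = 0; i < len; ++i) {
    if ((static_cast<unsigned char>(data[i]) & 0xC0) != 0x80) {
      ++count;
    }
  }
  return count;
}

}  // namespace shared

// Scalar entry point with C linkage, in the style of a function that gets
// compiled ahead of time and called per value (hypothetical name).
extern "C" inline int32_t sketch_utf8_length(const char* data, int32_t len) {
  return shared::Utf8Length(data, len);
}

// Batch loop over a string array laid out as a data buffer plus an
// offsets buffer holding num_values + 1 entries (hypothetical name).
inline void SketchUtf8LengthBatch(const char* data, const int32_t* offsets,
                                  int64_t num_values, int32_t* out) {
  for (int64_t i = 0; i < num_values; ++i) {
    const int32_t start = offsets[i];
    out[i] = shared::Utf8Length(data + start, offsets[i + 1] - start);
  }
}
```

Both wrappers stay trivial, so unit tests and benchmarks could target
the shared helper once instead of once per code base.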

I would hope in the future to see a generalized expression API as part
of a logical query plan-type system (for query processing) that has
the ability to use Gandiva (if it's available) to compile
subexpressions for better performance. I had hoped to spend some time
on this myself earlier this year, but I've gotten busy with some other
things and won't be able to devote much development time to it.

- Wes

On Sun, Nov 29, 2020 at 11:18 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> >
> > There are some computation kernels in Arrow and it looks like this part is
> > in active development right now. I wonder if there is a document / some
> > emails describing what the goals and use cases are for this part of the code
> > base. It would be very interesting to know a bit more and I would like to
> > contribute at some point.
>
>
> https://docs.google.com/document/d/1LFk3WRfWGQbJ9uitWwucjiJsZMqLh8lC1vAUOscLtj8/edit
> talks about some of the goals of the compute module.
>
> > I'm interested because I develop a Proof-of-concept for a declarative
> > language to perform statistical computations on top of gandiva.
>
>
> I think upon cursory examination someone (maybe Wes) thought Gandiva and
> the compute kernels might not play nicely together, but I can't find a
> reference to that at the moment.
>
>
> On Sat, Nov 21, 2020 at 3:09 AM Kirill Lykov <lykov.kir...@gmail.com> wrote:
>
> > Hi,
> >
> > There are some computation kernels in Arrow and it looks like this part is
> > in active development right now. I wonder if there is a document / some
> > emails describing what the goals and use cases are for this part of the code
> > base. It would be very interesting to know a bit more and I would like to
> > contribute at some point.
> > I'm interested because I am developing a proof of concept for a declarative
> > language to perform statistical computations on top of Gandiva.
> >
> > --
> > Best regards,
> > Kirill Lykov
> >
