One objective of the precompiled kernels project is to have meaningful computational functionality in a package that does not need to include the LLVM runtime. Requiring the LLVM dependency even for simple functions would more than double the size of our Python packages, for example.
There is currently little code sharing between functions that do identical work in arrow::compute:: versus gandiva:: -- this has been discussed, but it needs a champion to do something about it. When I was working on the new function framework earlier this year, I spent a day or so perusing src/gandiva/precompiled and reasoned it would be a prohibitive amount of refactoring for me to undertake at that time.

In principle many of these functions (e.g. string functions) can be incrementally refactored into reusable inline functions / templates for improved code reuse. We could also explore common infrastructure for unit testing and benchmarking. Anything is possible if enough engineering time is invested.

I would hope in the future to see a generalized expression API as part of a logical query plan-type system (for query processing) that has the ability to use Gandiva (if it's available) to compile subexpressions for better performance. I had hoped to spend some time on this myself earlier this year, but I've gotten busy with some other things and won't be able to devote much development time to this myself.

- Wes

On Sun, Nov 29, 2020 at 11:18 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> > There are some computations kernels in arrow and it looks that this part is
> > in active development right now. I wonder if there is a document / some
> > emails describing what is the goal and uses cases for this part of the code
> > base. Would be very interesting to know a bit more and I would like to
> > contribute at some point.
>
> https://docs.google.com/document/d/1LFk3WRfWGQbJ9uitWwucjiJsZMqLh8lC1vAUOscLtj8/edit
> talks about some of the goals of the compute module.
>
> > I'm interested because I develop a Proof-of-concept for a declarative
> > language to perform statistical computations on top of gandiva.
>
> I think upon cursory examination someone (maybe Wes) thought Gandiva and
> the compute kernels might not play nicely together, but I can't find a
> reference to that at the moment.
>
> On Sat, Nov 21, 2020 at 3:09 AM Kirill Lykov <lykov.kir...@gmail.com> wrote:
>
> > Hi,
> >
> > There are some computations kernels in arrow and it looks that this part is
> > in active development right now. I wonder if there is a document / some
> > emails describing what is the goal and uses cases for this part of the code
> > base. Would be very interesting to know a bit more and I would like to
> > contribute at some point.
> > I'm interested because I develop a Proof-of-concept for a declarative
> > language to perform statistical computations on top of gandiva.
> >
> > --
> > Best regards,
> > Kirill Lykov