This is really exciting, thanks a lot for sharing! In case anybody wants to try this out from Python, I wrote up some Cython bindings (very limited so far, but they can already be used to construct some computation graphs and do some benchmarks): https://github.com/apache/arrow/pull/2153
They are developed in the Arrow repo for now. It would be great if we could find a good solution to integrate the two projects and build systems seamlessly (for example, setting up a Cython environment in the Gandiva repo in a way that interoperates well with PyArrow would be hard right now).

-- Philipp.

On Thu, Jun 21, 2018 at 4:26 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> hi Jacques,
>
> This is very exciting! LLVM codegen for Arrow has been on my wishlist
> since the early days of the project. I always considered it more of a
> "when" question than an "if".
>
> I will take a closer look at the codebase to make some comments, but
> my biggest initial question is whether we could work to make Gandiva
> the official community-supported LLVM framework for creating
> JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
> to focus 90+% on Apache Arrow development) tech roadmap we discussed
> the need for a subgraph compiler using LLVM:
> https://ursalabs.org/tech/#subgraph-compilation-code-generation.
>
> I would be interested in getting involved in the project, and I
> expect in time many others will, as well. An obvious question would be
> whether you would be interested in donating the project to Apache
> Arrow and continuing the work there. We would benefit from common
> build, testing/CI, and packaging/deployment infrastructure. I'm keen
> to see JIT-powered predicate pushdown in Parquet files, for example.
> Philipp and I could look into building a Gandiva backend for compiling
> a subset of expressions originating from Ibis, a lazy-evaluation DSL
> system with a similar API to pandas
> (https://github.com/ibis-project/ibis).
>
> best
> Wes
>
> On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona
> <alen...@googlemail.com.invalid> wrote:
> > Hey Jacques,
> >
> > Great stuff! I'm actually researching the integration of arrow and flight
> > into a main memory database which also uses LLVM for dynamic query
> > generation! Excited to have a more detailed look at Gandiva!
> >
> > Cheers,
> > Dimitri.
> >
> > On Thu, Jun 21, 2018, 21:15 Jacques Nadeau <jacq...@apache.org> wrote:
> >
> >> Hey Guys,
> >>
> >> Dremio just open sourced a new framework for processing data in Arrow data
> >> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
> >> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
> >> Arrow Java libraries. I expect the developers who have been working on this
> >> will introduce themselves soon. To read more about it, take a look at our
> >> Ravindra's blog post (he's the lead developer driving this work): [2].
> >> Hopefully people will find this interesting/useful.
> >>
> >> Let us know what you all think!
> >>
> >> thanks,
> >> Jacques
> >>
> >>
> >> [1] https://github.com/dremio/gandiva
> >> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/