Hey Patrick,

I'd like to provide some input on both the design and the implementation of this (and I'm sure others would, too), but we are heavily loaded right now with the 1.0.0 release, so it may be a week or two before some of us can dedicate some brain cycles to it.

Thanks,
Wes

On Thu, Jul 9, 2020 at 3:56 PM Patrick Pai <patrick.m....@gmail.com> wrote:
>
> I'm working with Steve on this issue. Can you please share what you have in
> mind for something more general than Gandiva's serialized expressions?
>
> I'm currently working through a design. I imagine we will have a FlatBuffer
> schema defining all expression types and have the different cpp expression
> classes (i.e. ComparisonExpression) act as wrappers around the generated
> flatbuf structs.
>
> I also noticed that the data types used in filters are not backed by
> format/Expression.fbs and instead use the types defined in cpp/arrow/type.h
> I'm thinking it would be good to make the move to using Expression.fbs so
> that the data types themselves are also language independent. I'd appreciate
> any feedback or thoughts.
>
> On 2020/07/06 21:44:40, Wes McKinney <wesmck...@gmail.com> wrote:
> > I would also be interested in having a reusable serialized format for
> > filter- and projection-like expressions. I think trying to go so far
> > as full logical query plans suitable for building a SQL engine is
> > perhaps a bit too far but we could start small with the use case from
> > the JNI Datasets PR as a motivating example. We should also consider
> > replacing or deprecating Gandiva's serialized expressions in favor of
> > something more general.
> >
> > It may be a slight bikeshed issue, but I wouldn't be thrilled about
> > having this be based on Protocol Buffers, because of the runtime
> > requirement (on libprotobuf.so / libprotobuf.a) it introduces into C++
> > applications. Flatbuffers might be less pleasant developer UX in Java
> > but at least in C++ the fact that Flatbuffers results in zero build-
> > or runtime dependencies is a significant advantage.