Thanks for clarifying! And yes, my bad on the typo. I meant to say format/Schema.fbs.

On 2020/07/10 04:27:50, Micah Kornfield <emkornfi...@gmail.com> wrote:
> Hi Patrick,
>
> > I'm working with Steve on this issue. Can you please share what you have
> > in mind for something more general than Gandiva's serialized expressions?
>
> Not necessarily something "more" general, but we should ensure that the
> approach taken should be capable of representing the same information as
> Gandiva, so we can ultimately try to ensure convergence between the two.
> > I'm currently working through a design. I imagine we will have a FlatBuffer
> > schema defining all expression types and have the different cpp expression
> > classes (i.e. ComparisonExpression) act as wrappers around the generated
> > flatbuf structs.
>
> This sounds like the general approach that is taken for Schema.fbs, so
> probably a reasonable place to start. As Wes said, there will probably be
> a lot of input in this area, but having a concrete proposal would help
> guide the conversation.
> > I also noticed that the data types used in filters are not backed by
> > format/Expression.fbs and instead use the types defined in cpp/arrow/type.h
>
> Do you mean Schema.fbs?
>
> Thanks,
> Micah
>
> On Thu, Jul 9, 2020 at 1:56 PM Patrick Pai <patrick.m....@gmail.com> wrote:
>
> > I'm working with Steve on this issue. Can you please share what you have
> > in mind for something more general than Gandiva's serialized expressions?
> >
> > I'm currently working through a design. I imagine we will have a
> > FlatBuffer schema defining all expression types and have the different cpp
> > expression classes (i.e. ComparisonExpression) act as wrappers around the
> > generated flatbuf structs.
> >
> > I also noticed that the data types used in filters are not backed by
> > format/Expression.fbs and instead use the types defined in cpp/arrow/type.h
> > I'm thinking it would be good to make the move to using Expression.fbs so
> > that the data types themselves are also language independent. I'd
> > appreciate any feedback or thoughts.
> >
> > On 2020/07/06 21:44:40, Wes McKinney <wesmck...@gmail.com> wrote:
> > > I would also be interested in having a reusable serialized format for
> > > filter- and projection-like expressions. I think trying to go so far
> > > as full logical query plans suitable for building a SQL engine is
> > > perhaps a bit too far but we could start small with the use case from
> > > the JNI Datasets PR as a motivating example. We should also consider
> > > replacing or deprecating Gandiva's serialized expressions in favor of
> > > something more general.
> > >
> > > It may be a slight bikeshed issue, but I wouldn't be thrilled about
> > > having this be based on Protocol Buffers, because of the runtime
> > > requirement (on libprotobuf.so / libprotobuf.a) it introduces into C++
> > > applications. Flatbuffers might be less pleasant developer UX in Java
> > > but at least in C++ the fact that Flatbuffers results in zero build-
> > > or runtime dependencies is a significant advantage.
> > >