Per my other email, you can generate JSON that follows protobuf's canonical JSON mapping if you don't want to pull in the protobuf dependency. As for field typing: I could see treating the type as optional in a user expression and resolving it later.
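
To make that concrete, here is a rough sketch of what I have in mind, using only Python's standard json module. The message shapes in the comment are hypothetical and are not the actual Gandiva .proto definitions, and the helper names (field, int_literal, call, resolve_types) are made up for illustration. The only protobuf conventions it relies on are the proto3 JSON mapping rules: lowerCamelCase field names, 64-bit integers written as strings, and unset fields simply omitted.

```python
import json

# Hypothetical message shapes, NOT the actual Gandiva .proto definitions:
#   FieldRef { string name = 1; string type = 2; }   // type may be left unset
#   Literal  { oneof value { int64 int_value = 1; string string_value = 2; } }
#   Call     { string function = 1; repeated Node args = 2; }
#   Node     { oneof node { FieldRef field = 1; Literal literal = 2; Call call = 3; } }


def field(name, type_name=None):
    """Field reference; the type is omitted unless the caller knows it."""
    ref = {"name": name}
    if type_name is not None:
        ref["type"] = type_name
    return {"field": ref}


def int_literal(value):
    # The proto3 JSON mapping writes 64-bit integers as decimal strings.
    return {"literal": {"intValue": str(value)}}


def call(function, *args):
    return {"call": {"function": function, "args": list(args)}}


def resolve_types(node, schema):
    """Return a tree with missing field types filled in from schema.

    The input tree is not modified.
    """
    if "field" in node:
        ref = dict(node["field"])
        ref.setdefault("type", schema[ref["name"]])
        return {"field": ref}
    if "call" in node:
        inner = node["call"]
        return {"call": {"function": inner["function"],
                         "args": [resolve_types(a, schema) for a in inner["args"]]}}
    return node  # literals carry no field type


# Filter: x > 5 AND y == 3, with field types left unresolved by the user.
expr = call("and",
            call("greater_than", field("x"), int_literal(5)),
            call("equal", field("y"), int_literal(3)))

unresolved_json = json.dumps(expr, sort_keys=True)

# Later, whoever holds the schema fills in the types before evaluation.
resolved = resolve_types(expr, {"x": "int64", "y": "int32"})
resolved_json = json.dumps(resolved, sort_keys=True)
```

The producer never needs the protobuf library, and a consumer that does link protobuf could parse the same text with its JSON parser, provided the real schema matches the shapes assumed here.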

On Fri, Jul 24, 2020 at 12:47 PM Patrick Pai <p...@drwholdings.com> wrote:

> I only briefly looked into the Gandiva protobuf, but one issue seems to be
> using protobuf (Wes is against this for dependency reasons). There's also
> some inconsistencies between the Gandiva protobuf and how filter
> expressions should be represented, i.e. in the Gandiva protobuf fields are
> typed when I think fields should just contain a field name.
>
> -----Original Message-----
> From: Jacques Nadeau <jacq...@apache.org>
> Sent: Thursday, July 23, 2020 10:14 PM
> To: dev <dev@arrow.apache.org>
> Subject: [ext] Re: language independent representation of filter
> expressions
>
> Have you tried to use the existing expression representation provided by
> Gandiva? What are the issues you've seen with it?
>
> On Wed, Jul 22, 2020 at 10:24 AM Patrick Pai <p...@drwholdings.com> wrote:
>
> > Hi all,
> >
> > After some discussion with Steve, we'd like to propose and get
> > feedback on an alternative to representing expressions entirely with
> > flatbuffers.
> >
> > To give some context, we thought about how we'd construct flatbuffer
> > expressions in Java or another language if we went down that route. We
> > realized that it'd be possible, but not user friendly. An example is
> > specifying an array of int values in Java for an InExpression. In
> > Java, we'd ideally have some user-friendly class (i.e. arrow's
> > IntVector) that then gets converted to the appropriate flatbuffer
> > representation. I think this is what Jacques was saying about language
> > support being too weak - it's possible for Java users to construct a
> > flatbuffer expression, but not easily without an additional conversion
> > layer for every language.
> >
> > An alternative we're thinking about is to only represent enum values
> > (i.e. those defined in arrow::dataset::ExpressionType::type) in a
> > flatbuffer schema, and rely on the existing IPC format (used to
> > serialize/deserialize cpp expressions) to pass the struct array
> > representation of an expression from for example Java to C++. The one
> > difference is in the struct array representation, we use the enum
> > values defined in our flatbuffer schema instead of existing cpp enums.
> > This approach requires us on the Java side (and languages other than
> > C++) to construct the struct array, but the benefit is minimal changes
> > to the C++ code (the main change being using our flatbuffer schema enum
> > values).
> >
> > On 2020/07/13 09:21:19, Antoine Pitrou <solip...@pitrou.net> wrote:
> > > On Sat, 11 Jul 2020 09:55:16 -0700
> > > Jacques Nadeau <jacq...@apache.org> wrote:
> > > >
> > > > I'm against extending use of flatbuf within Arrow. The language
> > > > support is too weak. Language support isn't just about having a
> > > > binding for different languages, it is about having a high-quality
> > > > binding.
> > >
> > > Could you please expand on this? ("the language support is too weak")
> > >
> > > Thank you
> > >
> > > Antoine.
> >
> > This e-mail and any attachments may contain information that is
> > confidential and proprietary and otherwise protected from disclosure.
> > If you are not the intended recipient of this e-mail, do not read,
> > duplicate or redistribute it by any means. Please immediately delete
> > it and any attachments and notify the sender that you have received it
> > by mistake.
> > Unintended recipients are prohibited from taking action on the basis
> > of information in this e-mail or any attachments. The DRW Companies
> > make no representations that this e-mail or any attachments are free
> > of computer viruses or other defects.