Re: [VOTE][Format] Clarify allowed value range for the Time types

2021-08-23 Thread Fan Liya
+1 On Fri, Aug 20, 2021 at 11:37 PM Micah Kornfield wrote: > +1 (binding) > > On Fri, Aug 20, 2021 at 7:46 AM Keith Kraus > wrote: > > > +1 (non-binding) > > > > On Fri, Aug 20, 2021 at 9:49 AM Rok Mihevc wrote: > > > > > +1 (non-binding) > > > > > > On Fri, Aug 20, 2021 at 3:46 PM Jorge

RE: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
That's precisely what I was suggesting and expecting. Sounds good. I'll clean up my POC and make a proper PR sometime soon. Thanks much for the discussion! --Matt -----Original Message----- From: Antoine Pitrou Sent: Monday, August 23, 2021 2:18 PM To: dev@arrow.apache.org Subject: Re:

Re: [DISCUSS] Binary Values in Key value pairs WAS: Re: [INFO_REQUEST][FLIGHT] - Dynamic schema changes in ArrowFlight streams

2021-08-23 Thread David Li
I believe so. The encoding of a string in Flatbuffers is [byte] with a null terminator not included in the length, so old files should still be readable (they would simply not see the terminator anymore). And conversely, continuing to write the null terminator means new files should still be
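The compatibility argument above can be sketched in a few lines of Python. This is a simplified illustration, not the Flatbuffers library: it models only the length prefix, the payload, and the uncounted null terminator, and it ignores offsets, vtables, and alignment padding. The function names (`encode_value`, `read_as_bytes`, `read_as_string`) are hypothetical and exist only for this example.

```python
import struct


def encode_value(payload: bytes) -> bytes:
    """Write a 32-bit little-endian length, the payload, then a NUL
    terminator that is NOT counted in the length (simplified layout)."""
    return struct.pack("<I", len(payload)) + payload + b"\x00"


def read_as_bytes(buf: bytes) -> bytes:
    """Reader that treats the field as [byte]: it trusts the length
    prefix and never looks at the terminator."""
    (length,) = struct.unpack_from("<I", buf, 0)
    return buf[4:4 + length]


def read_as_string(buf: bytes) -> str:
    """Reader that treats the field as a string: same counted bytes,
    plus a terminator check and a strict UTF-8 decode."""
    payload = read_as_bytes(buf)
    end = 4 + len(payload)
    if buf[end:end + 1] != b"\x00":
        raise ValueError("missing null terminator")
    return payload.decode("utf-8")


# A textual value reads identically through both readers.
text = encode_value("schema-v2".encode("utf-8"))
print(read_as_string(text), read_as_bytes(text))

# A binary value is fine for the [byte] reader; a reader that
# validates UTF-8 rejects it.
binary = encode_value(b"\xff\x00\xfe")
print(read_as_bytes(binary))
```

Because the terminator sits outside the counted length, a reader that only follows the length prefix gets the same payload whether or not it knows about the terminator. Only the UTF-8 validation step distinguishes the two readers.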

Re: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Antoine Pitrou
Then we could provide a small C dataset API somewhere in the C++ source tree (perhaps `arrow/dataset/c/api.h`?). It would be unstable/experimental and could undergo changes or even removal without notice. Regards Antoine. On 23/08/2021 at 20:07, Matthew Topol wrote: Because Go is

RE: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
Because Go is always statically compiled for whatever platform you're on at the time, by default, importing Go libraries using `go get` from the command line actually does a git clone of the code and compiles it on the fly (because Go's compiler is pretty darn fast) and caches

Re: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Antoine Pitrou
On 23/08/2021 at 19:53, Matthew Topol wrote: The only thing I don't like about it being a private module in the Go implementation is distribution. For native Go code, consumers can just perform `go get` and have it work. But for this interface, it would require both consumers of the module and

RE: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
The only thing I don't like about it being a private module in the Go implementation is distribution. For native Go code, consumers can just perform `go get` and have it work. But for this interface, it would require both consumers of the module and any consumers of those consumers to have a local

Re: [DISCUSS] Developing an "Arrow Compute IR [Intermediate Representation]" to decouple language front ends from Arrow-native compute engines

2021-08-23 Thread Jacques Nadeau
In a lucky turn of events, Phillip actually turned out to be in my neck of the woods on Friday so we had a chance to sit down and discuss this. To help, I actually shared something I had been working on a few months ago independently (before this discussion started). For reference: Wes PR:

Re: [DISCUSS] Binary Values in Key value pairs WAS: Re: [INFO_REQUEST][FLIGHT] - Dynamic schema changes in ArrowFlight streams

2021-08-23 Thread Antoine Pitrou
On 23/08/2021 at 17:52, David Li wrote: Another way forward might be to relax the value type to [byte], but also require implementations to null-terminate binary values regardless. The C++ Flatbuffers implementation does this already [1] (though not the Java one [2]). Old implementations

Re: Flight SQL

2021-08-23 Thread Kyle Porter
We're just going into the C++ implementation now - would having the C++ client be enough here or are we looking for both the client and the server side? *Kyle Porter* CEO Bit Quill Technologies Inc. Office: +1.778.331.3355 | Direct: +1.604.441.7318 | ky...@bitquilltech.com

[RUST] New Metrics API Proposal

2021-08-23 Thread Andrew Lamb
I would like to point out a PR [1] that proposes a new API for recording and aggregating execution metrics (visible via `EXPLAIN ANALYZE` and programmatically), in case anyone would like to offer feedback on the design. [1] https://github.com/apache/arrow-datafusion/pull/908

Re: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Antoine Pitrou
On 23/08/2021 at 19:16, Matthew Topol wrote: Unfortunately, Go currently can only integrate with C++ libraries through a C interface. There does exist SWIG which is a generator for creating interface code between Go and C++, but ultimately it's just automating the creation of a C

RE: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
Unfortunately, Go currently can only integrate with C++ libraries through a C interface. There does exist SWIG which is a generator for creating interface code between Go and C++, but ultimately it's just automating the creation of a C interface and Go glue code. Personally I'm not a fan of the

Re: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Antoine Pitrou
On 23/08/2021 at 18:22, Matthew Topol wrote: That's a fair point, and part of the work I've done so far is a local Go implementation of at least consuming the C data interface. It will also eventually involve creating the necessary implementation to produce the C-Data interface too. But

RE: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
That's a fair point, and part of the work I've done so far is a local Go implementation of at least consuming the C data interface. It will also eventually involve creating the necessary implementation to produce the C-Data interface too. But specifically I'm asking for opinions on using that

Re: Flight SQL

2021-08-23 Thread David Li
If it's just the Protobuf it would at least not generate any code, but at that point it would probably be better to just have Kyle & co. copy-paste the file into the C++ PR until we can get it all settled. -David On Sun, Aug 22, 2021, at 17:53, Micah Kornfield wrote: > In the interest of

Re: [C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Antoine Pitrou
Hi Matt, As the name suggests, the C data interface is not a *programming* interface. It is a data sharing convention which relies on the existence of dedicated endpoints to produce or consume the C data structures. For example in Arrow C++, there is this set of APIs:

Re: [DISCUSS] Binary Values in Key value pairs WAS: Re: [INFO_REQUEST][FLIGHT] - Dynamic schema changes in ArrowFlight streams

2021-08-23 Thread David Li
Another way forward might be to relax the value type to [byte], but also require implementations to null-terminate binary values regardless. The C++ Flatbuffers implementation does this already [1] (though not the Java one [2]). Old implementations validating UTF8-ness would still be unable to

[C++][Go] CGO For Dataset API Integration

2021-08-23 Thread Matthew Topol
Hey All, So I've been working on a use case where I needed to be able to use the Dataset API from Golang. Instead of trying to port all of it to Golang (which would require porting the Compute side too), I decided to create a proof of concept using CGO to just call into the existing C++ code

Re: Review request for Dataset Java API PRs

2021-08-23 Thread Hongze Zhang
Hello guys, sorry, I have to request a review here again since no progress seems to have been made yet :(. These PRs are important to the Dataset Java implementation, as the first version of it was too basic to serve advanced use cases. If reviewing the big write support PR[1] sounds to be a bit of