On 23/08/2021 at 19:16, Matthew Topol wrote:
Unfortunately, Go can currently integrate with C++ libraries only through a C interface. SWIG can generate interface code between Go and C++, but ultimately it just automates the creation of a C interface and the Go glue code. Personally, I'm not a fan of the code SWIG generates and haven't had much luck with it.

I have a working POC that uses the datasets API via CGO through a C interface. It basically passes around a uintptr_t holding the address of a heap-allocated shared_ptr to a DatasetFactory/Dataset/Scanner, and uses the C Data Interface to pass the resulting record batches through without copying. I couldn't decide on the best way to integrate the idea and clean it up into a real PR, hence this email thread.

I initially started porting the Dataset API to Go, but it uses the compute expression classes to define and perform the filtering, and I realized that porting the entire compute library wouldn't be a good idea. So the question becomes at what level I do the implementation in Go and at what level I call through a C interface into the C++. The other question is whether that interface should be a separate component from the existing dataset/compute libraries that gets linked into the Go code, optionally as a separate module, so that the current Arrow Go implementation takes no dependency on the C++ libraries; only users of the Dataset API (and potentially the compute library) would need them.
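For readers following the thread, the handle-passing idea described above might look roughly like the sketch below. This is not the POC itself: MockDataset and every mock_dataset_* function are hypothetical names invented for illustration, standing in for the real Dataset classes, and the C Data Interface part is left out. It only shows a C-callable surface that hands out a uintptr_t holding the address of a heap-allocated shared_ptr.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a C++ class such as Dataset; not the Arrow API.
struct MockDataset {
  std::string uri;
  int64_t num_rows;
};

using MockDatasetPtr = std::shared_ptr<MockDataset>;

extern "C" {

// Creates the object and returns the address of a heap-allocated shared_ptr.
// The caller (for example Go via cgo) treats the value as an opaque handle.
uintptr_t mock_dataset_open(const char* uri, int64_t num_rows) {
  auto ds = std::make_shared<MockDataset>(MockDataset{uri, num_rows});
  return reinterpret_cast<uintptr_t>(new MockDatasetPtr(std::move(ds)));
}

// Reads a property through the handle without transferring ownership.
int64_t mock_dataset_num_rows(uintptr_t handle) {
  assert(handle != 0);
  return (*reinterpret_cast<MockDatasetPtr*>(handle))->num_rows;
}

// Returns a second handle that shares ownership of the same object.
uintptr_t mock_dataset_clone(uintptr_t handle) {
  assert(handle != 0);
  auto* src = reinterpret_cast<MockDatasetPtr*>(handle);
  return reinterpret_cast<uintptr_t>(new MockDatasetPtr(*src));
}

// Exposes the shared_ptr reference count, mainly useful for debugging.
long mock_dataset_use_count(uintptr_t handle) {
  assert(handle != 0);
  return reinterpret_cast<MockDatasetPtr*>(handle)->use_count();
}

// Deletes the heap-allocated shared_ptr; the underlying object is destroyed
// once the last handle referring to it has been released.
void mock_dataset_release(uintptr_t handle) {
  delete reinterpret_cast<MockDatasetPtr*>(handle);
}

}  // extern "C"
```

On the Go side, each handle would presumably be wrapped in a small struct whose release method calls mock_dataset_release exactly once, since the Go garbage collector knows nothing about the C++ allocation.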
I think the dataset C interface can start as a private module in the Go implementation. If it turns out to be useful to other people, we can then consider moving it into the Arrow C++ source tree.
Regards,
Antoine.