Hi Clark,

Cool! Before you go too far down the rabbit hole, would you be open to working within an R/ subdirectory in the Arrow codebase? It doesn't have to be ready-to-ship software, and we are happy to set up a branch in the repository for you to experiment in, so you don't have to worry about affecting the master branch or breaking builds. Otherwise, importing your work into the project later will become more complicated and require the Arrow PMC to do some paperwork: http://incubator.apache.org/ip-clearance/ .
I am happy to be available to answer questions on the mailing list, or offline, or discussions in JIRA or on GitHub pull requests. I am sure that Uwe and the other C++ developers will be happy to be available.

To get some basics off the ground, the essentials are being able to convert one or more record batches into an R data frame, and back. This is what we did in

https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.h
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/pandas_to_arrow.h

We have thin bindings in Cython (which is similar to Rcpp) that make this callable from Python.

What Hadley and I put together quickly for Feather last year was effectively a single Arrow record batch converting to and from pandas or R data frames. In Arrow, in practice you may be working with a table in many smaller chunks.

Looking forward to getting this off the ground!

Thanks,
Wes

On Thu, Jul 27, 2017 at 7:40 PM, Clark Fitzgerald <clarkfi...@gmail.com> wrote:
> I've got at least a "hello world" for R / Arrow bindings in progress.
> https://github.com/clarkfitzg/Rarrow
>
> Over the next couple weeks I plan to spend some time looking at the Arrow
> C++ and Python sources and write a few bindings by hand, then think about
> how to automatically generate bindings from the C++. Several approaches are
> possible, Rffi / rdyncall, Rcpp modules, or RCodegen / RCIndex leveraging
> Clang. Not sure which, if any, will work.
>
> I'm a beginner in C++. It would be very helpful if someone was available to
> answer questions on the C++ Arrow codebase, since I'd rather not email the
> whole dev list for this.
>
> Thanks,
> Clark