Hi Neal Richardson, I apologize for the late reply. The links are very helpful, thanks a ton! I went through them, and they are a very good starting point for a larger project I am working on, where my task is exactly this: conversions "to Arrow" and "from Arrow".
On 2020/03/29 20:40:59, Neal Richardson <neal.p.richard...@gmail.com> wrote:
> Hi Anish,
> You may be interested in how the Arrow R package uses the C interface to
> pass data to/from pyarrow. Both sides use the Arrow C++ library's
> implementation of the C interface. See
> https://github.com/apache/arrow/blob/master/r/src/py-to-r.cpp and
> https://github.com/apache/arrow/blob/master/r/R/py-to-r.R. The Arrow C++
> implementation is in
> https://github.com/apache/arrow/tree/master/cpp/src/arrow/c.
>
> Neal
>
> On Sun, Mar 29, 2020 at 12:14 PM Anish Biswas <anishbiswas...@gmail.com>
> wrote:
>
> > I have been trying to wrap my head around the CDataInterface.rst
> > (https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst)
> > document for a few days now. So what I am trying is basically to use the C
> > interface with a minimum dependencies to produce blocks of bytes that
> > pyarrow can reconstruct and work on as a normal pyarrow array (and
> > vice-versa: both directions).
> >
> > Here's what I already tried doing.
> >
> > - Created a C library that contains the two structs ArrowSchema and
> >   ArrowArray and some functions to export an int64_t array as an Arrow
> >   Array. This is very similar to what the document did with int32_t arrays.
> > - Imported the C library in Python. Created an int64_t pyarrow.array.
> >   Serialized it to read the bytes via Numpy and populated the C struct I
> >   created using the C library function.
> >
> > What I expected was that the bytes would have some resemblance to each
> > other and that pyarrow would have some utility to pick up the ArrowArray
> > struct and treat it as an Arrow Array. But I couldn't get it to work.
> >
> > I am also confused as to how do I use ArrowSchema properly. The
> > ArrowSchema is the only structure that differentiates different
> > ArrowArray formats.
> > However, the fact that I am not using it anywhere with the ArrowArray
> > struct or for that matter for any kind of initialization which tells the
> > Arrow library that "The next structure you will encounter would be of the
> > kind that the ArrowSchema has provided you", doesn't seem correct to me.
> >
> > It would really help me out, if you could tell if I actually misinterpreted
> > the doc, or am I doing something wrong. Thanks!
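
For anyone following the thread, here is a minimal sketch of the int64_t export described above, adapted from the int32_t example in CDataInterface.rst. The struct layouts and the "l" format string come from that document. The function names (export_int64_type, export_int64_array) and the choice to copy the caller's data are my own assumptions for illustration, not the Arrow C++ implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Struct layouts as given in CDataInterface.rst. */
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* The schema owns no heap memory here, so releasing only marks it released. */
static void release_int64_type(struct ArrowSchema* schema) {
  schema->release = NULL;
}

/* Hypothetical helper: describe the type. "l" is the format string for int64. */
void export_int64_type(struct ArrowSchema* schema) {
  *schema = (struct ArrowSchema){
      .format = "l", .name = "", .metadata = NULL, .flags = 0,
      .n_children = 0, .children = NULL, .dictionary = NULL,
      .release = &release_int64_type, .private_data = NULL};
}

/* Free the copied values and the buffer list, then mark the array released. */
static void release_int64_array(struct ArrowArray* array) {
  assert(array->n_buffers == 2);
  free((void*)array->buffers[1]);
  free((void*)array->buffers);
  array->release = NULL;
}

/* Hypothetical helper: export a copy of `data`. Returns 0 on success, -1 on error. */
int export_int64_array(const int64_t* data, int64_t nitems,
                       struct ArrowArray* array) {
  if (nitems < 0) return -1;
  const void** buffers = malloc(2 * sizeof(void*));
  /* Allocate at least one element so a zero-length array still gets a buffer. */
  int64_t* copy = malloc((size_t)(nitems > 0 ? nitems : 1) * sizeof(int64_t));
  if (buffers == NULL || copy == NULL) {
    free(buffers);
    free(copy);
    return -1;
  }
  if (nitems > 0) memcpy(copy, data, (size_t)nitems * sizeof(int64_t));
  buffers[0] = NULL; /* no nulls, so the validity bitmap can be omitted */
  buffers[1] = copy; /* the int64 values */
  *array = (struct ArrowArray){
      .length = nitems, .null_count = 0, .offset = 0, .n_buffers = 2,
      .n_children = 0, .buffers = buffers, .children = NULL,
      .dictionary = NULL, .release = &release_int64_array,
      .private_data = NULL};
  return 0;
}
```

The two structs travel as a pair. The ArrowArray carries no type information of its own, so the consumer reads the format string from the ArrowSchema to decide how to interpret the buffers. When it is done, the consumer calls each struct's release callback exactly once.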