Thanks! We have already implemented GPU IPC for CUDA:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/gpu/cuda_arrow_ipc.h

Is it possible to use these APIs? If not, what could be changed or added
to allow you to? I don't think it's worthwhile to maintain an alternative
implementation of the IPC protocol in a third party package. The results
can be converted to the C data structure that you listed.

Wes

On Wed, Aug 22, 2018, 4:56 PM Pearu Peterson <pearu.peter...@quansight.com> wrote:

> Hi Wes,
>
> Yes, sorry for the mess. Here is the message in plain text:
>
> The libgdf project defines a column structure that in a simplified form
> could be represented as
>
> typedef struct {
>   void *data;                     // column data
>   unsigned char *valid;           // validity mask, one bit per column item
>   size_t size;                    // nof items
>   enum {INT8, INT16, ...} dtype;  // type of column item
>   size_t null_count;              // nof non-valid items
> } my_column_t;
>
> The aim is to implement IPC protocol for sharing my_column_t data between
> host and GPU devices.
>
> What would be the most sensible way to do that using tools available in
> Arrow library?
>
> We are currently considering the following approaches:
>
> 1. Re-using Arrow Array: my_column_t and Arrow Array have one-to-one
>    correspondence regarding data content.
>
> 2. Defining new Arrow format MyColumn (using Arrow Tensor as an example):
>
>    table MyColumn {
>      /// The type of data contained in a value cell.
>      type: Type;
>      /// The number of non-valid items
>      null_count: long;
>      /// The location and size of the column's data
>      data: Buffer;
>      /// The location and size of the column's mask
>      valid: Buffer;
>    }
>
> We are uncertain which approach would be easiest to implement and maintain,
> be efficient (0-copy), or would make sense at all.
>
> Defining Arrow MyColumn seems appealing because of about 7 times less code
> in Arrow Tensor than in Arrow Array. However, Arrow Array includes validity
> mask already.
>
> What do you think?
>
> Best regards,
> Pearu
>
> On Wed, Aug 22, 2018 at 11:53 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Hi Pearu,
> >
> > Seems the formatting of your email got messed up a little bit. Can you
> > resend with some more line breaks?
> >
> > Thanks
> >
> > On Wed, Aug 22, 2018, 4:46 PM Pearu Peterson <pearu.peter...@quansight.com> wrote:
> >
> > > *Hi,The libgdf project defines a column structure that in a simplified
> > > form could be represented astypedef struct { void *data; // column data
> > > unsigned char *valid; // validity mask // one bit per column item
> > > size_t size; // nof items enum {INT8, INT16, ...} dtype; // type of
> > > column item size_t null_count; // nof non-valid items} my_column_t;The
> > > aim is to implement IPC protocol for sharing my_column_t data between
> > > host and GPU devices. What would be the most sensible way to do that
> > > using tools available in Arrow library?We are currently considering the
> > > following approaches:1. Re-using Arrow Array (C++): my_column_t and
> > > Arrow Array have one-to-one correspondence regarding data content.2.
> > > Defining new Arrow format MyColumn (using Arrow Tensor as an
> > > example):table MyColumn { /// The type of data contained in a value
> > > cell. type: Type; /// The number of non-valid items null_count: long;
> > > /// The location and size of the column's data data: Buffer; /// The
> > > location and size of the column's mask valid: Buffer;}We are uncertain
> > > which approach would be easiest to implement and maintain, be efficient
> > > (0-copy), or would make sense at all.Defining Arrow MyColumn seems
> > > appealing because of about 7 times less code in Arrow Tensor than in
> > > Arrow Array. However, Arrow Array includes validity mask already.What
> > > do you think?Best regards,Pearu*
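
For reference, the one-to-one correspondence between my_column_t and an
Arrow Array mentioned in this thread rests on both carrying a data buffer,
a validity bitmap, a length, and a null count. Below is a minimal C sketch
of reading such a mask. It assumes the `valid` mask uses the same layout as
an Arrow validity bitmap (item i is bit i % 8 of byte i / 8, least
significant bit first, a set bit meaning "valid"); whether libgdf follows
exactly this layout is an assumption here. The names my_dtype_t,
my_column_is_valid and my_column_count_nulls are hypothetical and come from
neither libgdf nor Arrow, and only the two dtype values named in the
message are listed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name for the enum in the message; the original lists
 * more values than the two shown here. */
typedef enum { INT8, INT16 } my_dtype_t;

/* Simplified column struct, as given in the message. */
typedef struct {
    void *data;            /* column data */
    unsigned char *valid;  /* validity mask, one bit per column item */
    size_t size;           /* number of items */
    my_dtype_t dtype;      /* type of column item */
    size_t null_count;     /* number of non-valid items */
} my_column_t;

/* Returns 1 if item i is valid, 0 otherwise.
 * Assumed layout: item i is bit (i % 8) of byte (i / 8), LSB first.
 * A NULL mask is treated as "all items valid". */
static int my_column_is_valid(const my_column_t *col, size_t i)
{
    assert(col != NULL && i < col->size);
    if (col->valid == NULL) {
        return 1;
    }
    return (col->valid[i / 8] >> (i % 8)) & 1;
}

/* Recomputes the number of non-valid items by scanning the mask. */
static size_t my_column_count_nulls(const my_column_t *col)
{
    assert(col != NULL);
    size_t nulls = 0;
    for (size_t i = 0; i < col->size; ++i) {
        if (!my_column_is_valid(col, i)) {
            ++nulls;
        }
    }
    return nulls;
}
```

If the layouts do match, the `data` and `valid` pointers could be handed
over without copying, and my_column_count_nulls offers a cheap consistency
check against the stored `null_count` after a round trip.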