Hi Pearu,

Seems the formatting of your email got messed up a little bit. Can you resend with some more line breaks?
Thanks

On Wed, Aug 22, 2018, 4:46 PM Pearu Peterson <pearu.peter...@quansight.com> wrote:

> Hi,
>
> The libgdf project defines a column structure that in a simplified form
> could be represented as
>
>     typedef struct {
>       void *data;                     // column data
>       unsigned char *valid;           // validity mask
>                                       // one bit per column item
>       size_t size;                    // nof items
>       enum {INT8, INT16, ...} dtype;  // type of column item
>       size_t null_count;              // nof non-valid items
>     } my_column_t;
>
> The aim is to implement IPC protocol for sharing my_column_t data between
> host and GPU devices. What would be the most sensible way to do that using
> tools available in Arrow library?
>
> We are currently considering the following approaches:
>
> 1. Re-using Arrow Array (C++): my_column_t and Arrow Array have one-to-one
>    correspondence regarding data content.
>
> 2. Defining new Arrow format MyColumn (using Arrow Tensor as an example):
>
>     table MyColumn {
>       /// The type of data contained in a value cell.
>       type: Type;
>       /// The number of non-valid items
>       null_count: long;
>       /// The location and size of the column's data
>       data: Buffer;
>       /// The location and size of the column's mask
>       valid: Buffer;
>     }
>
> We are uncertain which approach would be easiest to implement and maintain,
> be efficient (0-copy), or would make sense at all.
>
> Defining Arrow MyColumn seems appealing because of about 7 times less code
> in Arrow Tensor than in Arrow Array. However, Arrow Array includes validity
> mask already.
>
> What do you think?
>
> Best regards,
> Pearu
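
For readers following the thread, here is a minimal sketch in C of how the validity mask in the quoted struct could be read. It assumes the mask uses the same convention as Arrow validity bitmaps (least-significant bit first within each byte, a set bit meaning "valid"), which is what would make the one-to-one correspondence in approach 1 possible. The helper names my_column_is_valid and my_column_count_nulls are hypothetical and are not part of libgdf or Arrow, and treating a NULL mask as "all items valid" is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified column structure from the quoted message. */
typedef struct {
    void *data;            /* column data */
    unsigned char *valid;  /* validity mask, one bit per column item */
    size_t size;           /* number of items */
    enum { INT8, INT16 /* further types are elided in the original */ } dtype;
    size_t null_count;     /* number of non-valid items */
} my_column_t;

/* Hypothetical helper: returns 1 if item i is valid, 0 if it is null.
 * Assumption: Arrow-style bit order, so item i is stored in byte i / 8
 * at bit position i % 8 (least-significant bit first), with 1 = valid. */
int my_column_is_valid(const my_column_t *col, size_t i)
{
    assert(col != NULL && i < col->size);
    if (col->valid == NULL) {
        return 1;  /* assumption: a missing mask means every item is valid */
    }
    return (col->valid[i / 8] >> (i % 8)) & 1;
}

/* Hypothetical helper: recomputes the null count by scanning the mask.
 * The result should agree with the null_count field of a consistent column. */
size_t my_column_count_nulls(const my_column_t *col)
{
    size_t nulls = 0;
    assert(col != NULL);
    for (size_t i = 0; i < col->size; ++i) {
        if (!my_column_is_valid(col, i)) {
            ++nulls;
        }
    }
    return nulls;
}
```

With a 10-item column and mask bytes {0xF7, 0x01}, items 3 and 9 have their bits cleared, so both helpers would report them as null under the assumed bit order.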