Re: tensorflow-io Arrow Datasets and thoughts on support for tensor columns

2019-03-27 Thread Bryan Cutler
Thanks Wes! I am most interested in the last option, adding Tensor as a logical type, but if it makes sense to embed as a BinaryArray as a first step, that would still be useful. I'll work on a design doc with a use case and report back. I know there are a lot of different efforts going

Re: tensorflow-io Arrow Datasets and thoughts on support for tensor columns

2019-03-25 Thread Wes McKinney
Hi Bryan, I agree this would be useful to work out. There are a few options: * Sending multiple tensors as a sequence of encapsulated IPC messages (as described in https://github.com/apache/arrow/blob/master/docs/source/format/IPC.rst). There is no conflict with the columnar streaming protocol tha

tensorflow-io Arrow Datasets and thoughts on support for tensor columns

2019-03-22 Thread Bryan Cutler
Hi All, Recently I have been working with the TensorFlow SIG-IO community to introduce Apache Arrow-based Datasets for bringing Arrow data into TensorFlow. SIG-IO is a community-maintained repository focused on input/output support for TF, see https://github.com/tensorflow/io (a lot of formats fro