Re: Arrow file formats

2018-04-15 Thread Andy Grove
Hi Wes,

The IPC format looks ideal. Once again my enthusiasm is getting the better
of me. I will start looking into options for implementing this in Rust.

Thanks,

Andy.

On Sun, Apr 15, 2018 at 12:36 PM, Wes McKinney wrote:

> hi Andy,
>
> Is there a reason to not use the file format defined in
> https://github.com/apache/arrow/blob/master/format/IPC.md#file-format?
> We already have 3 implementations of this format in Java, C++, and
> JavaScript. Is there a way you could wrap the C or C++ Flatbuffers
> headers for use in Rust until the Rust generator is ready for
> primetime? Otherwise there's a lot of wheels to reinvent.
>
> > Somebody on Reddit quite reasonably pointed out that I should look at
> > Feather (which I didn't actually know about until now) and also mentioned
> > that it has been deprecated now in favor of some new format in the Arrow
> > project itself?
>
> FYI, I've found there's quite a bit of disinformation (or
> half-information) surrounding this project on the internet. People
> routinely say things to me at conferences and elsewhere that have
> resulted from misconceptions that have been propagated via word of
> mouth or Twitter. For example, I wrote
> http://wesmckinney.com/blog/feather-arrow-future/ in an effort to
> clear up confusion about where the Feather format is going.
>
> - Wes
>
> On Sun, Apr 15, 2018 at 1:35 PM, Andy Grove wrote:
> > I've started down the path of building a very simple file format for
> > transferring Arrow data between nodes in my project.
> >
> >
> > I'm also aware that the IPC mechanism might potentially be suitable but I
> > haven't had time to read the specs yet. I'm waiting on Google Flatbuffers
> > for Rust though before starting to contribute IPC support and I need
> > something usable in the meantime (and I'm happy to donate whatever I build
> > if it is useful).
> >
> > I'd appreciate hearing opinions on where Arrow is going in terms of
> > defining file formats.
> >
> > Thanks,
> >
> > Andy.
>


Re: Arrow file formats

2018-04-15 Thread Wes McKinney
hi Andy,

Is there a reason to not use the file format defined in
https://github.com/apache/arrow/blob/master/format/IPC.md#file-format?
We already have 3 implementations of this format in Java, C++, and
JavaScript. Is there a way you could wrap the C or C++ Flatbuffers
headers for use in Rust until the Rust generator is ready for
primetime? Otherwise there's a lot of wheels to reinvent.

> Somebody on Reddit quite reasonably pointed out that I should look at
> Feather (which I didn't actually know about until now) and also mentioned
> that it has been deprecated now in favor of some new format in the Arrow
> project itself?

FYI, I've found there's quite a bit of disinformation (or
half-information) surrounding this project on the internet. People
routinely say things to me at conferences and elsewhere that have
resulted from misconceptions that have been propagated via word of
mouth or Twitter. For example, I wrote
http://wesmckinney.com/blog/feather-arrow-future/ in an effort to
clear up confusion about where the Feather format is going.

- Wes

On Sun, Apr 15, 2018 at 1:35 PM, Andy Grove wrote:
> I've started down the path of building a very simple file format for
> transferring Arrow data between nodes in my project.
>
>
> I'm also aware that the IPC mechanism might potentially be suitable but I
> haven't had time to read the specs yet. I'm waiting on Google Flatbuffers
> for Rust though before starting to contribute IPC support and I need
> something usable in the meantime (and I'm happy to donate whatever I build
> if it is useful).
>
> I'd appreciate hearing opinions on where Arrow is going in terms of
> defining file formats.
>
> Thanks,
>
> Andy.


Arrow file formats

2018-04-15 Thread Andy Grove
I've started down the path of building a very simple file format for
transferring Arrow data between nodes in my project.

Somebody on Reddit quite reasonably pointed out that I should look at
Feather (which I didn't actually know about until now) and also mentioned
that it has been deprecated now in favor of some new format in the Arrow
project itself?

I'm also aware that the IPC mechanism might potentially be suitable but I
haven't had time to read the specs yet. I'm waiting on Google Flatbuffers
for Rust though before starting to contribute IPC support and I need
something usable in the meantime (and I'm happy to donate whatever I build
if it is useful).

I'd appreciate hearing opinions on where Arrow is going in terms of
defining file formats.

Thanks,

Andy.
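
For readers following the thread, here is a minimal Rust sketch of locating
the footer in the file format Wes links to. It assumes the layout as I
understand it from format/IPC.md: the file opens with the magic string
"ARROW1" padded to 8 bytes, and closes with the Flatbuffers footer, a
little-endian int32 footer size, and the magic string again. The name
footer_range is hypothetical and not part of any Arrow library, and decoding
the footer itself would still need Flatbuffers support, which this sketch
does not attempt.

```rust
/// Magic string assumed (per format/IPC.md) to open and close an IPC file.
const MAGIC: &[u8] = b"ARROW1";

/// Leading magic plus padding to an 8-byte boundary (assumed layout).
const HEADER_LEN: usize = 8;

/// Hypothetical helper: returns the (start, end) byte range of the
/// Flatbuffers footer, or None if the buffer does not match the layout.
fn footer_range(bytes: &[u8]) -> Option<(usize, usize)> {
    // Smallest plausible file: padded magic + i32 footer size + magic.
    if bytes.len() < HEADER_LEN + 4 + MAGIC.len() {
        return None;
    }
    if !bytes.starts_with(MAGIC) || !bytes.ends_with(MAGIC) {
        return None;
    }

    // The footer size sits immediately before the trailing magic.
    let size_end = bytes.len() - MAGIC.len();
    let size_start = size_end - 4;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&bytes[size_start..size_end]);
    let footer_len = i32::from_le_bytes(raw);
    if footer_len < 0 {
        return None;
    }

    // The footer must fit between the header and the size field.
    let footer_start = size_start.checked_sub(footer_len as usize)?;
    if footer_start < HEADER_LEN {
        return None;
    }
    Some((footer_start, size_start))
}
```

The returned range only says where the footer bytes are; a real reader would
hand that slice to generated Flatbuffers code to find the schema and record
batch offsets.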