Re: Re: Human-readable version of Arrow Schema

2020-05-08 Thread Christian Hudon
"m2": "meta 2", > "m3": "meta 3" > }, > "children": [] > } > ], > "metadata":

Aw: Re: Human-readable version of Arrow Schema

2020-05-07 Thread hans-joachim . bothe
": "meta 3" }, "children": [] } ], "metadata": { "m1": "meta 1", "m2":

Re: Human-readable version of Arrow Schema?

2020-05-05 Thread Christian Hudon
Hi folks! I'm back. Yes to François's comments. This has to be something that is readable by data scientists, researchers, etc. without having the doc side-by-side, which is definitely not the case for the C-interface representation. I've created a draft pull request with code that's definitely

Re: Human-readable version of Arrow Schema?

2020-01-09 Thread Francois Saint-Jacques
The desired goal for this feature is trivial modifications, e.g. within an editor, by data-scientists and researchers. I'd go for the flatbuffer's json representation as it is stable and has native support in almost any language or editor due to the ubiquity of JSON. The C interface schema string

Re: Human-readable version of Arrow Schema?

2020-01-08 Thread Kohei KaiGai
Hello, pg2arrow [*1] has '--dump' mode to print out schema definition of the given Apache Arrow file. Does it make sense for you? $ ./pg2arrow --dump ~/hoge.arrow [Footer] {Footer: version=V4, schema={Schema: endianness=little, fields=[{Field: name="id", nullable=true, type={Int32}, children=[],

Re: Human-readable version of Arrow Schema?

2020-01-08 Thread Micah Kornfield
The C-interface representation is probably slightly less readable than the JSON implementation if I understand the flatbuffer to JSON conversion properly. But as Antoine pointed out it depends on the use-case. FWIW, flatbuffers maintainers indicated forward/backward compatibility is intended to

Re: Human-readable version of Arrow Schema?

2020-01-04 Thread Antoine Pitrou
On 04/01/2020 at 23:17, Jacques Nadeau wrote: > I guess we'd still need to introduce a way to nest, it only has type > representation. Right. Before exploring this direction more in depth, I think it would be useful to know what the intended use case is. Perhaps the OP (Christian Hudon)

Re: Human-readable version of Arrow Schema?

2020-01-04 Thread Jacques Nadeau
I guess we'd still need to introduce a way to nest, it only has type representation. On Sat, Jan 4, 2020 at 2:16 PM Jacques Nadeau wrote: > What do people think about using the C interface representation? > > On Sun, Dec 29, 2019 at 12:42 PM Micah Kornfield > wrote: > >> I opened

Re: Human-readable version of Arrow Schema?

2020-01-04 Thread Jacques Nadeau
What do people think about using the C interface representation? On Sun, Dec 29, 2019 at 12:42 PM Micah Kornfield wrote: > I opened https://github.com/google/flatbuffers/issues/5688 to try to get > some clarity. > > On Tue, Dec 24, 2019 at 12:13 PM Wes McKinney wrote: > > > On Tue, Dec 24,

Re: Human-readable version of Arrow Schema?

2019-12-29 Thread Micah Kornfield
I opened https://github.com/google/flatbuffers/issues/5688 to try to get some clarity. On Tue, Dec 24, 2019 at 12:13 PM Wes McKinney wrote: > On Tue, Dec 24, 2019 at 2:47 AM Micah Kornfield > wrote: > >> > >> If we were to make the same kinds of forward/backward compatibility > >> guarantees

Re: Human-readable version of Arrow Schema?

2019-12-23 Thread Micah Kornfield
> > If we were to make the same kinds of forward/backward compatibility > guarantees as with Flatbuffers it could create a lot of work for > maintainers. Does it pay to follow-up with the flatbuffer project to understand if the forward/backward compatibility guarantees the flatbuffers provide

Re: Human-readable version of Arrow Schema?

2019-12-11 Thread Micah Kornfield
> > With these two together, it would seem not too difficult to create a text > representation for Arrow schemas that (at some point) has some > compatibility guarantees, but maybe I'm missing something? I think the main risk is if somehow flatbuffers JSON parsing doesn't handle backward

Re: Human-readable version of Arrow Schema?

2019-12-10 Thread Christian Hudon
Micah: I didn't know that Flatbuffers supported serialization to/from JSON, thanks. That seems like a very good start, at least. I'll aim to create a draft pull request that at least wires everything up in Arrow so we can load/save a Schema.fbs instance from/to JSON. At least it'll make it easier

Re: Human-readable version of Arrow Schema?

2019-12-09 Thread Wes McKinney
The only "canonical" representation of schemas at the moment is the Flatbuffers data structure [1] Having a human-readable/parseable text representation I think only makes sense if it is offered without any backward/forward compatibility guarantees. Note I had previously opened

Re: Human-readable version of Arrow Schema?

2019-12-07 Thread Maarten Ballintijn
Is there a syntax specified for schemas? Cheers, Maarten. > On Dec 6, 2019, at 5:01 PM, Micah Kornfield wrote: > > Hi Christian, > As far as I know no-one is working on a canonical text representation for > schemas. A JSON serializer exists for integration test purposes, but > IMO it

Re: Human-readable version of Arrow Schema?

2019-12-06 Thread Micah Kornfield
Hi Christian, As far as I know no-one is working on a canonical text representation for schemas. A JSON serializer exists for integration test purposes, but IMO it shouldn't be relied upon as canonical. It looks like Flatbuffers supports serialization to/from JSON [1

Human-readable version of Arrow Schema?

2019-12-06 Thread Christian Hudon
Hi, For the uses I would like to make of Arrow, I would need a human-readable and -writable version of an Arrow Schema, that could be converted to and from the Arrow Schema C++ object. Going through the doc for 0.15.1, I don't see anything to that effect, with the closest being the ToString()
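
As a rough illustration of the round trip requested in the original post (a schema that can be written out as editable text and read back in), here is a minimal sketch in Python using only the standard library. The key names (name, type, nullable, children, metadata) loosely follow the JSON fragments and the pg2arrow dump quoted in the replies above. The `Field` and `Schema` classes and the helper functions are hypothetical stand-ins: they are not the Arrow API, not the Flatbuffers JSON format, and not the format used in the draft pull request.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class Field:
    # Hypothetical stand-in for an Arrow field; types are plain strings here.
    name: str
    type: str
    nullable: bool = True
    metadata: Dict[str, str] = field(default_factory=dict)
    children: List["Field"] = field(default_factory=list)


@dataclass
class Schema:
    # Hypothetical stand-in for an Arrow schema: fields plus key/value metadata.
    fields: List[Field] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def field_from_dict(d):
    # Rebuild a Field (and its nested children) from a decoded JSON object.
    return Field(
        name=d["name"],
        type=d["type"],
        nullable=d.get("nullable", True),
        metadata=dict(d.get("metadata", {})),
        children=[field_from_dict(c) for c in d.get("children", [])],
    )


def schema_to_json(schema):
    # asdict() recurses into nested dataclasses, lists and dicts.
    return json.dumps(asdict(schema), indent=2, sort_keys=True)


def schema_from_json(text):
    d = json.loads(text)
    return Schema(
        fields=[field_from_dict(f) for f in d.get("fields", [])],
        metadata=dict(d.get("metadata", {})),
    )


# Usage: write a schema out as text, then read it back unchanged.
schema = Schema(
    fields=[
        Field("id", "int32"),
        Field("tags", "list", children=[Field("item", "utf8")]),
    ],
    metadata={"m1": "meta 1", "m2": "meta 2"},
)
text = schema_to_json(schema)
restored = schema_from_json(text)
```

Nesting is expressed through `children`, which is the piece the thread notes is missing from a type-only string representation. A sketch like this makes no forward/backward compatibility promises.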