>
> > > makes it more difficult to bring schema evolution back into the
> > > IPC Stream format (i.e. it would live only in flight)
> >
> > Gosh's proposal extends the flatbuffer structures not the protobufs. Can
> > you help me understand how difficult it would be to bring the `schema_id`
> > approach to the IPC stream format?
>
> I thought we were talking solely about the Flight Protobuf definitions -
> not the Flatbuffers (and the Google doc at least only talks about the
> Protobufs).
>

I somehow missed that schema_id is being added to the protobuf in the
document. It feels to me that schema_id is a property that would ideally
apply only to the RecordBatch. I now better understand Micah's dictionary
concerns, too.

> > Side Question: Why isn't the IPC stream format a series of the flight
> > protobufs? It's a real shame that there is no standard way to
> > capture/replay a stream with app_metadata. (Obviously ignoring the
> > annoyances around protobuf wrapping flatbuffers.)
>
> The IPC format was defined long before Flight, and Flight's app_metadata
> was added after Flight's initial definition. Note an IPC message does have
> a provision for key-value metadata, though I think APIs for that are not
> fully exposed. (See ARROW-6940:
> https://issues.apache.org/jira/browse/ARROW-6940 and despite my comments
> there perhaps we need to unify or at least consider how Flight's
> app_metadata relates to the IPC message custom_metadata. Also perhaps see
> ARROW-1059.)
>

KeyValue is unfortunately string-to-string. In FlatBuffers, strings may only
be UTF-8 or 7-bit ASCII. Flight's app_metadata, on the other hand, is opaque
bytes, which is a bit more useful.
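
To illustrate the string-vs-bytes difference, here is a rough sketch in plain
Python. It does not use any Arrow API; the payload and the helper names
(is_valid_utf8, to_metadata_value, from_metadata_value) are made up for the
example. It only shows that arbitrary bytes are not generally valid UTF-8, so
they would need something like base64 before they could go into a
string-valued KeyValue entry:

```python
import base64

# Hypothetical opaque payload, standing in for an app_metadata blob.
payload = bytes([0xFF, 0xFE, 0x00, 0x80])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes could be stored as-is in a UTF-8 string field."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_metadata_value(data: bytes) -> str:
    """Encode opaque bytes as ASCII text so they fit a string-to-string map."""
    return base64.b64encode(data).decode("ascii")


def from_metadata_value(value: str) -> bytes:
    """Recover the original bytes from the base64 text."""
    return base64.b64decode(value)


if __name__ == "__main__":
    print("valid UTF-8 as-is:", is_valid_utf8(payload))
    encoded = to_metadata_value(payload)
    print("round-trips:", from_metadata_value(encoded) == payload)
```

The cost is the extra encode/decode step and roughly a third more bytes on the
wire, which an opaque bytes field avoids.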
