Thanks for writing this.

I agree. That is a good decision tree. +1

Best,
Jorge


On Thu, Apr 29, 2021 at 6:08 PM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> The discussion around adding another interval type to Schema.fbs raises
> the question of when we should add a new type to Schema.fbs versus using
> other means (primarily extension types [1]).
>
> A few criteria come to mind that could help decide (feedback welcome):
>
> 1.  Is the type a new parameterization of an existing type?
>     - If yes, and we believe the parameterization is useful and can be done
> in a forward/backward compatible manner, then we would update Schema.fbs.
>
> 2.  Does the type itself have its own specification for processing (e.g.
> JSON, BSON, Thrift, Avro, Protobuf)?
>     - If yes, we would NOT add it to Schema.fbs.  I think this would
> potentially yield too many new types.
>
> 3.  Is the underlying encoding of the type already semantically supported
> by an existing type (e.g. physical lengths like meters can be represented
> by an integer)?
>     - If yes, we would NOT update the specification.  This seems like the
> exact use-case that extension types are meant for.
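
The three criteria above can be read as a small decision procedure. The sketch below is only one possible reading. The `TypeProposal` class, its field names, and the `suggest_home` function are hypothetical, and two things are assumptions not stated in the message: that the questions are checked in the order listed, and that a "needs discussion" fallback applies when none of them matches.

```python
from dataclasses import dataclass


@dataclass
class TypeProposal:
    """Hypothetical answers to the three questions for a proposed type."""
    new_parameterization_of_existing_type: bool = False
    useful_and_compatible: bool = False
    has_own_processing_spec: bool = False
    encoding_already_supported: bool = False


def suggest_home(p: TypeProposal) -> str:
    # Criterion 1: a useful, forward/backward compatible parameterization
    # of an existing type.
    if p.new_parameterization_of_existing_type and p.useful_and_compatible:
        return "update Schema.fbs"
    # Criterion 2: the type has its own processing specification
    # (e.g. JSON, BSON, Thrift, Avro, Protobuf).
    if p.has_own_processing_spec:
        return "do not add to Schema.fbs"
    # Criterion 3: the encoding is already covered by an existing type.
    if p.encoding_already_supported:
        return "extension type"
    # Assumption: the message does not say what happens otherwise.
    return "needs discussion"


meters = TypeProposal(encoding_already_supported=True)
print(suggest_home(meters))
```
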
>
> * How does this apply to Interval? *
> Interval extends an existing type in the specification, and multiple "packed
> fields" cannot be easily communicated with the current version of the
> specification.  Hence, I feel comfortable making the addition to Schema.fbs.
>
> * What does this mean for other common types? *
>
> As types come up that are very common but that we don't want to add to
> Schema.fbs, I think we should invest in formalizing them as "Well Known"
> extension types.  In this scenario, we would update the specification to
> include how to specify the extension type metadata (and still require that
> at least two libraries support the extension type before inclusion as "Well
> Known").
>
> * Practical implications *
>
> I think this means the type system in Schema.fbs is mostly closed (i.e.
> there is a high bar for adding new types). One potentially useful type to
> have would be a "packed struct" that supports something similar to Python's
> struct library [2].  I think this would likely cover many extension type
> use-cases.
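
For readers unfamiliar with the struct library referenced in [2], here is a minimal sketch of what "packed fields" look like with it. The three-field layout (months, days, nanoseconds) and the helper names are hypothetical, chosen only for illustration. The message does not specify a layout, and this is not Arrow's encoding.

```python
import struct

# Hypothetical layout: little-endian int32 months, int32 days, int64 nanoseconds.
# The "<" prefix selects little-endian byte order with no padding between fields.
INTERVAL_FORMAT = "<iiq"


def pack_interval(months: int, days: int, nanos: int) -> bytes:
    """Pack the three fields into one fixed-width binary value."""
    return struct.pack(INTERVAL_FORMAT, months, days, nanos)


def unpack_interval(buf: bytes) -> tuple:
    """Recover (months, days, nanos) from a packed value."""
    return struct.unpack(INTERVAL_FORMAT, buf)


packed = pack_interval(1, 15, 500)
print(len(packed))              # fixed width given by struct.calcsize(INTERVAL_FORMAT)
print(unpack_interval(packed))
```

The format string plays the role that type metadata would play for a packed struct: it is enough for a reader to split a fixed-width binary value back into its fields.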
>
> Thoughts?
>
> -Micah
>
> [1] https://arrow.apache.org/docs/format/Columnar.html#extension-types
> [2] https://docs.python.org/3/library/struct.html
>
