On Mon, Nov 25, 2019 at 8:52 AM Antoine Pitrou <anto...@python.org> wrote:
>
>
> Hello,
>
> The spec has the following language about union type ids:
> """
> Types buffer: A buffer of 8-bit signed integers. Each type in the union
> has a corresponding type id whose values are found in this buffer. A
> union with more than 127 possible types can be modeled as a union of unions.
> """
> https://arrow.apache.org/docs/format/Columnar.html#union-layout
>
> However, in several places the C++ code assumes type ids are unsigned.
> Java doesn't seem to implement type ids (and there is no integration
> task for union types).
>
> In the flatbuffers description, the type ids array is modeled as an
> array of signed 32-bit integers.
>
> Moreover, according to the language above, type ids should be restricted
> to the [0, 127] interval?  Which one should it be?

The (optional) type ids in the metadata provide a correspondence
between the union types / children and the values found in the types
buffer (data). As stated in the spec, the types buffer holds 8-bit
signed integers. As I recall, we used [ Int ] in the metadata because
the Int type is thought to be easier for languages to work with in
general when serializing/deserializing the metadata.

Functionally, these values are limited to the range [0, 127], so we
should probably add some comments about this in Schema.fbs.
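
To illustrate the correspondence described above, here is a minimal
sketch in plain Python. It is not the Arrow API: the function names
(build_child_lookup, resolve_children) and the sample values are
hypothetical. It assumes the i-th type id in the metadata belongs to
the i-th child, and it rejects ids outside [0, 127].

```python
from array import array

MAX_TYPE_ID = 127  # largest value of an 8-bit signed integer


def build_child_lookup(type_ids):
    """Map each metadata type id to the index of its child (hypothetical helper)."""
    lookup = {}
    for child_index, type_id in enumerate(type_ids):
        # The metadata stores ids as 32-bit ints, but the types buffer is int8,
        # so only non-negative int8 values are usable.
        if not 0 <= type_id <= MAX_TYPE_ID:
            raise ValueError(f"type id {type_id} outside [0, {MAX_TYPE_ID}]")
        if type_id in lookup:
            raise ValueError(f"duplicate type id {type_id}")
        lookup[type_id] = child_index
    return lookup


def resolve_children(types_buffer, type_ids):
    """Return the child index selected by each slot of the types buffer."""
    lookup = build_child_lookup(type_ids)
    return [lookup[type_id] for type_id in types_buffer]


# Illustrative values only: three children with type ids 5, 10 and 20.
type_ids = [5, 10, 20]
types_buffer = array("b", [5, 20, 10, 5])  # "b" is a signed 8-bit typecode
child_indices = resolve_children(types_buffer, type_ids)
```

With these sample values, the slots resolve to children 0, 2, 1 and 0.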

> Regards
>
> Antoine.
