Hi,

In the UnionArray, there is a level of indirection between types (buffer of
i8s) -> typeId (i8) -> field. For example, the generated_union part of our
integration tests has the data:

types: [5, 5, 5, 5, 7, 7, 7, 7, 5, 5, 7] (len = 11)
typeids: [5, 7]
fields: [int32, utf8]

My understanding is that, to get the field of item 4, we read types[4] (7),
look up its index in typeids (1), take the field at index 1 (utf8), and then
read the value (at slot 4, or at another slot, depending on sparseness).
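
To make that lookup concrete, here is a minimal sketch in plain Python of the
steps as I understand them. It is not the actual implementation; field_of is a
hypothetical helper, the fields are just names, and the final step of reading
the value is left out.

```python
# Data copied from the generated_union example above.
types = [5, 5, 5, 5, 7, 7, 7, 7, 5, 5, 7]
type_ids = [5, 7]
fields = ["int32", "utf8"]


def field_of(item, types, type_ids, fields):
    """Return (field index, field name) for one slot (hypothetical helper)."""
    type_id = types[item]            # step 1: read the type id stored for the slot
    index = type_ids.index(type_id)  # step 2: find its position in typeids
    return index, fields[index]      # step 3: take the field at that position


print(field_of(4, types, type_ids, fields))  # (1, 'utf8')
```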

Does someone know the rationale for the intermediate typeid? I.e., couldn't
the types contain the index of the field directly: [0, 0, 0, 0, 1, 1, 1, 1,
0, 0, 1] (replace 5 by 0, 7 by 1, and not use typeids)?
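
For reference, the remapping I have in mind can be sketched as follows (plain
Python again, purely illustrative; position and direct are names I made up):

```python
# Data copied from the generated_union example above.
types = [5, 5, 5, 5, 7, 7, 7, 7, 5, 5, 7]
type_ids = [5, 7]

# Map each type id to the position of its field, then rewrite the buffer
# so that it stores field positions directly.
position = {type_id: i for i, type_id in enumerate(type_ids)}
direct = [position[t] for t in types]

print(direct)  # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
```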

Best,
Jorge
