Jorge,
I think your analysis is correct. Some historical context on why there is
an indirection is covered in the original JIRA:
https://issues.apache.org/jira/browse/ARROW-257

Some other discussions:
https://lists.apache.org/x/thread.html/75028183d54cb4f6ff588b043fe126f10b2cba8e373673fad6ba889d@%3Cdev.arrow.apache.org%3E
https://lists.apache.org/x/thread.html/b219ef51dda71bef83dcdec94e68e2881d49f751b29a8c1251f653d5@%3Cdev.arrow.apache.org%3E
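
For reference, a minimal sketch of the lookup described in the quoted
message, in plain Python rather than any Arrow API. The data is the
generated_union example from below; child_index and field_of are
names made up for illustration.

```python
# Values from the generated_union integration-test example quoted below.
types = [5, 5, 5, 5, 7, 7, 7, 7, 5, 5, 7]  # one type id per slot (the i8 buffer)
type_ids = [5, 7]                           # type ids declared on the union type
fields = ["int32", "utf8"]                  # child fields, same order as type_ids

# Build the type id -> child index map once instead of searching per slot.
child_index = {type_id: i for i, type_id in enumerate(type_ids)}


def field_of(slot):
    """Return the child field name for a slot (hypothetical helper)."""
    return fields[child_index[types[slot]]]


# The direct encoding proposed in the question: the indirection resolved up front.
direct = [child_index[t] for t in types]
```

With this data, field_of(4) resolves to utf8, and direct is the
0/1 buffer suggested in the question.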

-Micah

On Fri, Aug 13, 2021 at 10:57 AM Keith Kraus <keith.j.kr...@gmail.com>
wrote:

> How would using the typeid directly work with arbitrary Extension types?
>
> -Keith
>
> On Fri, Aug 13, 2021 at 12:49 PM Jorge Cardoso Leitão <
> jorgecarlei...@gmail.com> wrote:
>
> > Hi,
> >
> > In the UnionArray, there is a level of indirection between types (buffer
> > of i8s) -> typeId (i8) -> field. For example, the generated_union part of
> > our integration tests has the data:
> >
> > types: [5, 5, 5, 5, 7, 7, 7, 7, 5, 5, 7] (len = 11)
> > typeids: [5, 7]
> > fields: [int32, utf8]
> >
> > My understanding is that, to get the field of item 4, we read types[4]
> > (7), look for the index of it in typeids (1), and take the field of index 1
> > (utf8), and then read the value (4 or other depending on sparseness).
> >
> > Does someone know the rationale for the intermediate typeid? I.e.,
> > couldn't the types contain the index of the field directly [0, 0, 0, 0, 1,
> > 1, 1, 1, 0, 0, 1] (replace 5 by 0, 7 by 1, and not use typeids)?
> >
> > Best,
> > Jorge
> >
>
