On Thu, 15 Aug 2019 11:17:07 -0700
Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > In C++ they are
> > independent, we could have 32-bit array lengths and variable-length
> > types with 64-bit offsets if we wanted (we just wouldn't be able to
> > have a List child with more than INT32_MAX elements).  
> I think the point is we could do this in C++ but we don't.  I'm not sure we
> would have introduced the "Large" types if we did.

64-bit offsets take twice as much space as 32-bit offsets, so if you're
storing lots of small-ish lists or strings, 32-bit offsets are
preferable.  So even with 64-bit array lengths from the start it would
still be beneficial to have types with 32-bit offsets.
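
To put rough numbers on this, here is a minimal sketch of the offsets
buffer size for both widths.  It assumes the usual variable-length
layout with one more offset than there are elements; the function name
and the element count are made up for illustration.

```python
def offsets_buffer_bytes(num_elements: int, offset_width_bits: int) -> int:
    """Size of the offsets buffer of a variable-length array (hypothetical helper)."""
    # A list/string array with N elements stores N + 1 offsets.
    return (num_elements + 1) * (offset_width_bits // 8)


num_elements = 10_000_000  # illustrative: ten million short strings
small = offsets_buffer_bytes(num_elements, 32)
large = offsets_buffer_bytes(num_elements, 64)
print(small, large)  # the 64-bit buffer is exactly twice the 32-bit one
```

The value data and validity bitmap are the same size either way; only
the offsets buffer doubles.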

> Going with the limited address space in Java and calling it a reference
> implementation seems suboptimal. If a consumer uses a "Large" type
> presumably it is because they need the ability to store more than INT32_MAX
> child elements in a column, otherwise it is just wasting space [1].

Probably. Though if the individual elements (lists or strings) are
large, not much space is wasted in proportion, so it may be simpler in
such a case to always create a "Large" type array.
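
As a back-of-the-envelope sketch of "in proportion": going from 32-bit
to 64-bit offsets adds 4 bytes per element, so the relative overhead
depends on the average element size.  The helper below is hypothetical
and ignores the validity bitmap and the one extra trailing offset.

```python
def extra_offset_overhead(avg_element_bytes: float) -> float:
    """Fractional size increase from 64-bit offsets, relative to the
    32-bit layout (value data plus 4-byte offset per element)."""
    extra_bytes_per_element = 4  # 8-byte offset instead of 4-byte offset
    return extra_bytes_per_element / (avg_element_bytes + 4)


for size in (8, 1024, 1 << 20):  # illustrative average element sizes
    print(size, f"{extra_offset_overhead(size):.6%}")
```

With 8-byte strings the array grows by about a third, while with
megabyte-sized elements the increase is negligible.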

> [1] I suppose theoretically there might be some performance benefits on
> 64-bit architectures to using the native word sizes.

In practice, common 64-bit architectures don't penalize 32-bit integers,
as 32-bit is an extremely common integer size even in high-performance
code.


