Hi Felipe,

Thanks for the introduction. I'd be interested to hear about the
applications Velox has found for these vectors, and in what situations they
are useful. This could be contrasted with the current ListArray
implementations.

IIUC it would be fairly cheap to transform a ListArray to an ArrayView, but
expensive to go the other way.
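
To make that cost asymmetry concrete, here is a rough sketch in plain Python lists (not the Arrow API; the helper names list_to_view and view_to_list are hypothetical, and validity is ignored for brevity). Going from list offsets to views only derives a sizes buffer and can leave the child values untouched, while the reverse has to copy the child values whenever views are out of order or overlap.

```python
def list_to_view(list_offsets):
    """Derive (offsets, sizes) from ListArray-style offsets of length n + 1.

    The child values are not touched; only a sizes buffer is computed.
    """
    offsets = list_offsets[:-1]
    sizes = [list_offsets[i + 1] - list_offsets[i] for i in range(len(offsets))]
    return offsets, sizes


def view_to_list(offsets, sizes, values):
    """Rebuild monotonic list offsets from views.

    Views may be unordered or overlapping, so the child values are
    copied out in logical order (the expensive direction).
    """
    new_offsets = [0]
    new_values = []
    for off, size in zip(offsets, sizes):
        new_values.extend(values[off:off + size])
        new_offsets.append(len(new_values))
    return new_offsets, new_values


values = ["a", "b", "c", "d"]

# Cheap direction: three lists [a, b], [], [c, d].
view_offsets, view_sizes = list_to_view([0, 2, 2, 4])

# Expensive direction: views that are out of order and overlapping.
list_offsets, copied_values = view_to_list([2, 0, 1], [2, 2, 2], values)
```

In the second call the four child values grow to six after the copy, because the overlapping ranges have to be materialized separately.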

Best,

Will Jones

On Tue, Apr 25, 2023 at 3:00 PM Felipe Oliveira Carvalho <
felipe...@gmail.com> wrote:

> Hi folks,
>
> I would like to start a public discussion on the inclusion of a new array
> format to Arrow — array-view array. The name is also up for debate.
>
> This format is inspired by Velox's ArrayVector format [1]. Logically, this
> array represents an array of arrays. Each element is an array-view (offset
> and size pair) that points to a range within a nested "values" array
> (called "elements" in Velox docs). The nested array can be of any type,
> which makes this format very flexible and powerful.
>
> [image: ../_images/array-vector.png]
> <https://facebookincubator.github.io/velox/_images/array-vector.png>
>
> I'm currently working on a C++ implementation and plan to work on a Go
> implementation to fulfill the two-implementations requirement for format
> changes.
>
> The draft design:
>
> - 3 buffers: [validity_bitmap, int32 offsets buffer, int32 sizes buffer]
> - 1 child array: "values" as an array of the type parameter
>
> validity_bitmap is used to differentiate between empty array views
> (sizes[i] == 0) and NULL array views (validity_bitmap[i] == 0).
>
> When validity_bitmap[i] is 0, both sizes[i] and offsets[i] are undefined
> (as usual), and when sizes[i] == 0, offsets[i] is undefined. Setting
> undefined entries to 0 is recommended if writing a value is not an issue
> for the system producing the arrays.
>
> The offsets buffer is not required to be ordered, and views don't have to
> be disjoint.
>
> [1]
> https://facebookincubator.github.io/velox/develop/vectors.html#arrayvector
>
> Thanks,
> Felipe O. Carvalho
>
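
As an illustration of the draft layout quoted above, the following is a minimal sketch of how a reader might decode it, using plain Python lists rather than real Arrow buffers. The function name decode_array_views is hypothetical, and the validity bitmap is modeled as a list of 0/1 integers instead of packed bits.

```python
def decode_array_views(validity, offsets, sizes, values):
    """Materialize each array view as a Python list (None for NULL)."""
    out = []
    for valid, off, size in zip(validity, offsets, sizes):
        if not valid:
            # NULL view: offsets[i] and sizes[i] are undefined, so ignore them.
            out.append(None)
        elif size == 0:
            # Empty view: offsets[i] is undefined, so it is never read.
            out.append([])
        else:
            out.append(values[off:off + size])
    return out


values = [10, 20, 30, 40, 50]
validity = [1, 1, 0, 1]
offsets = [3, 0, 0, 2]   # not ordered
sizes = [2, 0, 0, 3]     # views 0 and 3 overlap on values[3:5]

decoded = decode_array_views(validity, offsets, sizes, values)
```

Entry 1 (valid, size 0) decodes to an empty list and entry 2 (invalid) decodes to None, which is the distinction the validity bitmap exists to preserve.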
