Hi folks,

I would like to start a public discussion on adding a new array format to Arrow: the array-view array. The name is also up for debate.

This format is inspired by Velox's ArrayVector format [1]. Logically, this array represents an array of arrays. Each element is an array-view (an offset and size pair) that points to a range within a nested "values" array (called "elements" in the Velox docs). The nested array can be of any type, which makes this format very flexible and powerful.

[image: ../_images/array-vector.png] <https://facebookincubator.github.io/velox/_images/array-vector.png>

I'm currently working on a C++ implementation and plan to work on a Go implementation to fulfill the two-implementations requirement for format changes.

The draft design:

- 3 buffers: [validity_bitmap, int32 offsets buffer, int32 sizes buffer]
- 1 child array: "values" as an array of the type parameter

validity_bitmap is used to differentiate between empty array views (sizes[i] == 0) and NULL array views (validity_bitmap[i] == 0).

When validity_bitmap[i] is 0, both sizes[i] and offsets[i] are undefined (as usual), and when sizes[i] == 0, offsets[i] is undefined. Writing 0 in those undefined slots is recommended if setting a value is not an issue for the system producing the arrays.

The offsets buffer is not required to be ordered, and views don't have to be disjoint.

[1] https://facebookincubator.github.io/velox/develop/vectors.html#arrayvector

Thanks,
Felipe O. Carvalho
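
Illustrative sketch of the draft layout: the snippet below shows how a reader might resolve each view against the "values" child, following the rules described above. It is only a sketch under stated assumptions, not the C++ or Go implementation: the function name resolve_array_views is hypothetical, the validity bitmap is modeled as a plain list of 0/1 flags rather than a packed bitmap, and the child array is a plain Python list.

```python
from typing import List, Optional, Sequence


def resolve_array_views(
    validity: Sequence[int],
    offsets: Sequence[int],
    sizes: Sequence[int],
    values: Sequence[int],
) -> List[Optional[List[int]]]:
    """Materialize every view as a list, or None for a NULL view.

    Hypothetical helper for illustration only. `validity` is an
    unpacked list of 0/1 flags (an assumption; the draft uses a bitmap).
    """
    out: List[Optional[List[int]]] = []
    for valid, offset, size in zip(validity, offsets, sizes):
        if not valid:
            # NULL view: offsets[i] and sizes[i] are undefined, never read them.
            out.append(None)
        elif size == 0:
            # Empty view: offsets[i] is undefined, so it is not used.
            out.append([])
        else:
            out.append(list(values[offset:offset + size]))
    return out


values = [10, 20, 30, 40, 50]
validity = [1, 1, 0, 1, 1]
offsets = [3, 0, 0, 0, 1]  # not ordered
sizes = [2, 3, 0, 0, 2]    # views 1 and 4 overlap on [20, 30]

result = resolve_array_views(validity, offsets, sizes, values)
print(result)
```

Here the first view points past the second one and the last view overlaps the second, which a format with ordered, disjoint ranges would not allow.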