felipecrv commented on code in PR #37877:
URL: https://github.com/apache/arrow/pull/37877#discussion_r1343028440
##########
docs/source/format/Columnar.rst:
##########
@@ -487,6 +499,102 @@ will be represented as follows: ::
|-------------------------------|-----------------------|
| 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | unspecified (padding) |
+ListView Layout
+~~~~~~~~~~~~~~~
+
+The ListView layout is defined by three buffers: a validity bitmap, an offsets
+buffer, and an additional sizes buffer. Sizes and offsets have the identical bit
+width and both 32-bit and 64-bit signed integer options are supported.
+
+As in the List layout, the offsets encode the start position of each slot in the
+child array. In contrast to the List layout, list lengths are stored explicitly
+in the sizes buffer instead of inferred. This allows offsets to be out of order.
+Elements of the child array do not have to be stored in the same order they
+logically appear in the list elements of the parent array.
+
+When a value is null, the corresponding offset and size can have arbitrary
+values. When size is 0, the corresponding offset can have an arbitrary value.
+If choosing a value is possible, we recommend setting offsets and sizes to 0 in
Review Comment:
> In your example, surely you have to perform the bounds checks anyway as
> the non-null elements might be shorter than displacement?

`displacement` is greater than `0` but `<=` every non-null, non-empty
list-view offset in the buffer. It's calculated by finding the smallest valid
offset in the source array before applying the transformation. Finding the
smallest offset requires checking the null mask and the `sizes[i]` values;
applying the transformation is where many alternatives are available.
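
As a rough illustration of the two steps described above (this is not the Arrow C++ or Go code), here is a Python sketch that uses plain lists in place of the validity, offsets, and sizes buffers. The names `min_valid_offset` and `rebase_offsets` are hypothetical, and writing `0` into the offsets of null or empty slots is only one of the possible transformation choices.

```python
from typing import List, Optional


def min_valid_offset(validity: List[bool], offsets: List[int],
                     sizes: List[int]) -> Optional[int]:
    """Smallest offset among non-null, non-empty list-views (None if there are none)."""
    candidates = [
        off
        for valid, off, size in zip(validity, offsets, sizes)
        if valid and size > 0  # null and empty slots may hold arbitrary offsets
    ]
    return min(candidates) if candidates else None


def rebase_offsets(validity: List[bool], offsets: List[int],
                   sizes: List[int]) -> List[int]:
    """Subtract the displacement from every meaningful offset.

    Assumption for this sketch: null or empty slots are given offset 0.
    """
    displacement = min_valid_offset(validity, offsets, sizes)
    if displacement is None:
        return [0] * len(offsets)
    return [
        off - displacement if valid and size > 0 else 0
        for valid, off, size in zip(validity, offsets, sizes)
    ]


# Slot 1 is null and slot 3 is empty, so their offsets are arbitrary.
validity = [True, False, True, True]
offsets = [7, 99, 4, 1000]
sizes = [2, 5, 3, 0]
rebased = rebase_offsets(validity, offsets, sizes)
```

Because the displacement is the minimum over the meaningful offsets, none of the rebased offsets in this sketch can become negative.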
I will check the C++ implementation again, but in the Go implementation I'm
doing all the checks.
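
For reference, the kind of per-slot check being discussed could look like the Python sketch below. It follows the draft wording quoted in the diff hunk (null slots and size-0 slots may carry arbitrary offsets); the function name `out_of_bounds_slots` and the list-based buffers are assumptions for illustration, not the Go implementation.

```python
from typing import List


def out_of_bounds_slots(validity: List[bool], offsets: List[int],
                        sizes: List[int], child_length: int) -> List[int]:
    """Indices of non-null slots whose [offset, offset + size) range leaves the child array."""
    bad = []
    for i, (valid, off, size) in enumerate(zip(validity, offsets, sizes)):
        if not valid:
            continue  # null slot: offset and size are arbitrary under the draft text
        if size < 0:
            bad.append(i)  # a negative size is never meaningful
            continue
        if size == 0:
            continue  # empty list-view: the draft text allows any offset
        if off < 0 or off + size > child_length:
            bad.append(i)
    return bad


# Slot 2 reads child elements 8..10 of a 10-element child array, so it is flagged.
bad = out_of_bounds_slots([True, False, True, True],
                          [0, 99, 8, 1000],
                          [4, 5, 3, 0],
                          child_length=10)
```

A stricter spec would drop the two `continue` branches for null and empty slots, which is the trade-off between writers and readers mentioned below.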

Being stricter in the spec is more likely to constrain kernels that write or
transform list-views, but hopefully we compensate for that with more efficient
kernels that read list-views.