wesm commented on code in PR #37526:
URL: https://github.com/apache/arrow/pull/37526#discussion_r1323349402
##########
format/Message.fbs:
##########
@@ -99,6 +99,14 @@ table RecordBatch {
/// Optional compression of the message body
compression: BodyCompression;
+
+ /// Some types such as Utf8View are represented using a variable number of buffers.
+ /// For each such Field in the pre-ordered flattened logical schema, there will be
Review Comment:
I agree it is not strictly clear from reading
##########
docs/source/format/Columnar.rst:
##########
@@ -106,8 +106,10 @@ the different physical layouts defined by Arrow:
* **Primitive (fixed-size)**: a sequence of values each having the
same byte or bit width
* **Variable-size Binary**: a sequence of values each having a variable
- byte length. Two variants of this layout are supported using 32-bit
- and 64-bit length encoding.
+ byte length. Three variants of this layout are supported using
+ * 32-bit offset encoding
+ * 64-bit offset encoding
+ * 128-bit view-or-inline encoding
Review Comment:
I can see both sides of the argument. From the standpoint that "it encodes the
same type of data" (a variable-size binary value), including it in this list
makes sense and is perhaps somewhat less confusing for a reader of the
specification.
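
To make the "128-bit view-or-inline encoding" item above concrete, here is a minimal Python sketch. It assumes the 16-byte layout as I understand it from the Arrow columnar format: a 4-byte length, followed either by up to 12 inline bytes (zero-padded) or by a 4-byte prefix, a 4-byte data buffer index, and a 4-byte offset. The names `encode_view`, `decode_view`, and `INLINE_MAX` are hypothetical helpers for illustration, not Arrow APIs.

```python
import struct

# Assumption: 12 bytes remain for inline data after the 4-byte length.
INLINE_MAX = 12


def encode_view(value, buffer_index=0, offset=0):
    """Pack one binary value into a 16-byte little-endian view (sketch)."""
    length = len(value)
    if length <= INLINE_MAX:
        # Short value: length, then the bytes themselves, zero-padded to 12.
        return struct.pack("<i12s", length, value)
    # Long value: length, 4-byte prefix, data buffer index, offset into it.
    return struct.pack("<i4sii", length, value[:4], buffer_index, offset)


def decode_view(view, data_buffers):
    """Recover the value from a 16-byte view and the list of data buffers."""
    (length,) = struct.unpack_from("<i", view, 0)
    if length <= INLINE_MAX:
        # Inline case: the value lives entirely inside the view.
        return view[4:4 + length]
    _prefix, buffer_index, offset = struct.unpack_from("<4sii", view, 4)
    return data_buffers[buffer_index][offset:offset + length]
```

Either branch yields exactly 16 bytes per value, which is what distinguishes this variant from the 32-bit and 64-bit offset encodings: those store one offset per value and keep all bytes in a single data buffer, whereas a view may point into any of several data buffers.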
##########
docs/source/format/Columnar.rst:
##########
@@ -350,6 +352,53 @@ will be represented as follows: ::
|----------------|-----------------------|
| joemark | unspecified (padding) |
+.. versionadded:: Arrow Columnar Format 1.4
+
+Variable-size Binary View Layout
Review Comment:
I would agree that this should at least be made consistent with the decision
made above (perhaps just added as a subsection within Variable-size Binary).