This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 5f3688351f MINOR: [Format] Clarify that the buffers for the Binary
View layout differ in the C Data Interface (#40156)
5f3688351f is described below
commit 5f3688351f3adfba9a84d9e0bd65b300eabe35d2
Author: Joris Van den Bossche <[email protected]>
AuthorDate: Tue Feb 27 09:15:55 2024 +0100
MINOR: [Format] Clarify that the buffers for the Binary View layout differ
in the C Data Interface (#40156)
### Rationale for this change
Attempt to draw more attention to the fact that the buffer listing / number
of buffers differ between the main Format spec and the C Data Interface, for
the Binary View layout.
Triggered by feedback from implementing this in duckdb at
https://github.com/duckdb/duckdb/pull/10481#discussion_r1489245865
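To make the difference concrete, here is a minimal sketch of the buffer count for a binary or utf-8 view array under each spec. The helper name `binary_view_n_buffers` and its parameters are hypothetical, made up for illustration, and are not part of any Arrow API. It assumes the layout has a validity bitmap, a views buffer, and zero or more variadic data buffers, and that the C Data Interface appends one buffer holding the variadic buffer lengths.

```python
def binary_view_n_buffers(n_variadic: int, c_data_interface: bool) -> int:
    """Count the buffers of a binary/utf-8 view array (illustrative only).

    n_variadic: number of variadic data buffers holding the long strings.
    c_data_interface: True to count buffers as exported through the
        C Data Interface, False for the Columnar format spec.
    """
    if n_variadic < 0:
        raise ValueError("n_variadic must be non-negative")
    fixed = 2  # validity bitmap + views buffer
    # The C Data Interface appends one buffer with the variadic buffer lengths.
    extra = 1 if c_data_interface else 0
    return fixed + n_variadic + extra


# Same array, two different buffer counts depending on the spec.
columnar_count = binary_view_n_buffers(3, c_data_interface=False)
c_data_count = binary_view_n_buffers(3, c_data_interface=True)
print(columnar_count, c_data_count)
```

A consumer that sizes its buffer list from the Columnar spec alone would therefore be one short when importing such an array through the C Data Interface.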
Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
---
docs/source/format/CDataInterface.rst | 7 ++++++-
docs/source/format/Columnar.rst | 3 +++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/docs/source/format/CDataInterface.rst b/docs/source/format/CDataInterface.rst
index ef4bf1cf32..fd9952b037 100644
--- a/docs/source/format/CDataInterface.rst
+++ b/docs/source/format/CDataInterface.rst
@@ -467,7 +467,10 @@ It has the following fields:
Mandatory. The number of physical buffers backing this array. The
number of buffers is a function of the data type, as described in the
- :ref:`Columnar format specification <format_columnar>`.
+ :ref:`Columnar format specification <format_columnar>`, except for the
+ binary or utf-8 view type, which has one additional buffer compared
+ to the Columnar format specification (see
+ :ref:`c-data-interface-binary-view-arrays`).
Buffers of children arrays are not included.
@@ -552,6 +555,8 @@ parameterized extension types).
The ``ArrowArray`` structure exported from an extension array simply points
to the storage data of the extension array.
+.. _c-data-interface-binary-view-arrays:
+
Binary view arrays
------------------
diff --git a/docs/source/format/Columnar.rst b/docs/source/format/Columnar.rst
index 84f251968f..7b74b972f2 100644
--- a/docs/source/format/Columnar.rst
+++ b/docs/source/format/Columnar.rst
@@ -409,6 +409,9 @@ All integers (length, buffer index, and offset) are signed.
This layout is adapted from TU Munich's `UmbraDB`_.
+Note that this layout uses one additional buffer to store the variadic buffer
+lengths in the :ref:`Arrow C data interface <c-data-interface-binary-view-arrays>`.
+
.. _variable-size-list-layout:
Variable-size List Layout