pitrou commented on code in PR #41823:
URL: https://github.com/apache/arrow/pull/41823#discussion_r1681277787
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -283,6 +283,148 @@ UUID
A specific UUID version is not required or guaranteed. This extension represents
UUIDs as FixedSizeBinary(16) with big-endian notation and does not
interpret the bytes in any way.
+Opaque
+=======
+
+Opaque represents a type or array that an Arrow-based system received from an
+external (often non-Arrow) system, which it cannot interpret or did not have
+support for in advance. In this case, it can pass on Opaque to its clients to
+show that a field exists, but that it cannot interpret the field or data.
+
+Extension parameters:
+
+* Extension name: ``arrow.opaque``.
+
+* The storage type of this extension is any type. If there is no underlying
+ data, the storage type should be Null.
+
+* Extension type parameters:
+
+ * **type_name** = the name of the unknown type in the external system.
+ * **vendor_name** = the name of the external system.
+
+* Description of the serialization:
+
+ A valid JSON object containing the parameters as fields. In the future,
+ additional fields may be added, but all fields current and future are never
+ required to interpret the array.
+
+Rationale
+---------
+
Review Comment:
I think it would be nice to make the rationale shorter because, as it is,
many people will not read it at all.
For example:
> Arrow systems often wrap non-Arrow systems, and so they must be prepared
> to handle data types and data that don't have an equivalent Arrow type. Some
> columns in the original data may not be representable at all using Arrow, yet
> we might still want to signal their presence.
>
> Instead of trying to make up a separate extension type for every possible
> non-Arrow data type (including ones that the Arrow system was not expecting to
> receive at all), the Opaque type can be used. Because it explicitly
> means that we do not support a type, it can be used to declare an unsupported
> field or column. In other words: if an Arrow system encounters a non-Arrow type
> it was not prepared to handle, it can use Opaque to still pass the type on to a
> client.
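
As a side note, the serialization described in the diff (a JSON object carrying `type_name` and `vendor_name`, with any other fields optional) could be sketched roughly as follows. This is only an illustration using Python's standard `json` module. The helper names and the example values (`geometry`, `postgresql`, `future_field`) are hypothetical and not part of the proposed spec or of any Arrow library.

```python
import json
from typing import Tuple


def serialize_opaque_metadata(type_name: str, vendor_name: str) -> str:
    """Hypothetical helper: encode the two Opaque parameters as a JSON object."""
    return json.dumps({"type_name": type_name, "vendor_name": vendor_name})


def deserialize_opaque_metadata(serialized: str) -> Tuple[str, str]:
    """Hypothetical helper: decode the parameters, ignoring unknown fields.

    The proposed text says future fields may be added but are never required
    to interpret the array, so any extra keys are simply skipped here.
    """
    obj = json.loads(serialized)
    if not isinstance(obj, dict):
        raise ValueError("Opaque metadata must be a JSON object")
    return obj["type_name"], obj["vendor_name"]


# Illustrative values only: an unknown "geometry" type from a "postgresql" source.
metadata = serialize_opaque_metadata("geometry", "postgresql")
print(deserialize_opaque_metadata(metadata))
```

A reader written this way keeps working if a later revision adds fields, which matches the forward-compatibility wording in the proposed text.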