rok commented on code in PR #37166:
URL: https://github.com/apache/arrow/pull/37166#discussion_r1305610018


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -148,6 +148,60 @@ Fixed shape tensor
   by this specification. Instead, this extension type lets one use fixed shape 
tensors
   as elements in a field of a RecordBatch or a Table.
 
+.. _variable_shape_tensor_extension:
+
+Variable shape tensor
+=====================
+
+* Extension name: `arrow.variable_shape_tensor`.
+
+* The storage type of the extension is: ``StructArray`` where struct
+  is composed of **data** and **shape** fields describing a single
+  tensor per row:
+
+  * **data** is a ``List`` holding tensor elements of a single tensor.
+    Data type of the list elements is uniform across the entire column
+    and also provided in metadata.
+  * **shape** is a ``FixedSizeList`` of the tensor shape where
+    the size of the list is equal to the number of dimensions of the
+    tensor.
+
+* Extension type parameters:
+
+  * **value_type** = the Arrow data type of individual tensor elements.
+  * **ndim** = the number of dimensions of the tensor.
+
+  Optional parameters describing the logical layout:
+
+  * **dim_names** = same as `arrow.fixed_shape_tensor`
+
+  * **permutation** = same as in `arrow.fixed_shape_tensor`
+
+* Description of the serialization:
+
+  The metadata must be a valid JSON object including number of
+  dimensions of the contained tensors as an integer with key **"ndim"**
+  plus optional dimension names with keys **"dim_names"** and ordering of
+  the dimensions with key **"permutation"**.
+
+  - Example: ``{ "ndim": 2}``
+  - Example with ``dim_names`` metadata for NCHW ordered data:
+
+    ``{ "ndim": 3, "dim_names": ["C", "H", "W"]}``
+
+  - Example of permuted 3-dimensional tensor:
+
+    ``{ "ndim": 3, "permutation": [2, 0, 1]}``
+
+    This is the physical layout shape and the shape of the logical
+    layout would given an individual tensor of shape [100, 200, 500]
+    be ``[500, 100, 200]``.

Review Comment:
   Good point! `ndim` is provided by storage type and serializing `ndim` would 
be redundant. I'll refactor it out of JSON. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to