pitrou commented on code in PR #33925:
URL: https://github.com/apache/arrow/pull/33925#discussion_r1094381643


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -72,4 +72,30 @@ same rules as laid out above, and provide backwards compatibility guarantees.
 Official List
 =============
 
-No canonical extension types have been standardized yet.
+Fixed shape tensor
+==================
+
+* Extension name: `arrow.fixed_shape_tensor`.
+
+* The storage type of the extension: ``FixedSizeList`` where:
+
+  * **value_type** is the data type of individual tensors and
+    is an instance of ``pyarrow.DataType`` or ``pyarrow.Field``.
+  * **list_size** is the product of all the elements in tensor shape.
+
+* Extension type parameters:
+
+  * **value_type** = Arrow DataType of the tensor elements
+  * **shape** = shape of the contained tensors as a tuple
+  * **is_row_major** = boolean indicating the order of elements

Review Comment:
   I see two problems with an "is_row_major" flag:
   * When it's false, you don't know what the actual layout is: is it
column-major, or a non-trivial order like in the transpose() example?
   * Saying that it "should be interpreted as row-major" doesn't say what the
dimensions actually are. If you are shipping NumPy arrays, that's not a
problem, because dimensions are anonymous (you are just doing abstract math on
tensors). But if you are shipping ML tensors, the information might be
incomplete (IIUC, you can have row-major "NHWC" or row-major "NCHW").
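
To make both points concrete, here is a minimal NumPy sketch. It is only an illustration: `dim_order` is a hypothetical helper (not part of NumPy or Arrow), and the shapes are arbitrary. It shows a permuted tensor that is neither row-major nor column-major, and two row-major tensors of the same size whose dimensions mean different things.

```python
import numpy as np


def dim_order(arr):
    """Hypothetical helper: axes sorted from slowest- to fastest-varying in memory."""
    return tuple(int(i) for i in np.argsort([-s for s in arr.strides], kind="stable"))


# Point 1: "not row-major" does not identify a single layout.
a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
candidates = {
    "row_major": a,
    "col_major": np.asfortranarray(a),
    "permuted": a.transpose(1, 0, 2),  # a strided view, no data is copied
}
layouts = {
    name: (bool(t.flags["C_CONTIGUOUS"]), bool(t.flags["F_CONTIGUOUS"]), dim_order(t))
    for name, t in candidates.items()
}
for name, (is_c, is_f, order) in layouts.items():
    print(f"{name}: C={is_c} F={is_f} dim_order={order}")

# Point 2: two row-major tensors with the same number of elements but
# different dimension semantics (height/width/channel vs. channel/height/width).
hwc = np.zeros((32, 32, 3), dtype=np.float32)
chw = np.zeros((3, 32, 32), dtype=np.float32)
both_row_major = bool(hwc.flags["C_CONTIGUOUS"]) and bool(chw.flags["C_CONTIGUOUS"])
print(both_row_major, hwc.size, chw.size)
```

The `permuted` entry reports neither contiguity flag, so a boolean cannot describe it, whereas a permutation such as the one returned by `dim_order` can. In the second case both tensors would share the same `list_size` in storage, and nothing in a row-major flag distinguishes them.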
   


