jorisvandenbossche commented on code in PR #33925: URL: https://github.com/apache/arrow/pull/33925#discussion_r1094563604
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -72,4 +72,30 @@ same rules as laid out above, and provide backwards compatibility guarantees.
 Official List
 =============
 
-No canonical extension types have been standardized yet.
+Fixed shape tensor
+==================
+
+* Extension name: `arrow.fixed_shape_tensor`.
+
+* The storage type of the extension: ``FixedSizeList`` where:
+
+  * **value_type** is the data type of individual tensors and
+    is an instance of ``pyarrow.DataType`` or ``pyarrow.Field``.
+  * **list_size** is the product of all the elements in tensor shape.
+
+* Extension type parameters:
+
+  * **value_type** = Arrow DataType of the tensor elements
+  * **shape** = shape of the contained tensors as a tuple
+
+* Description of the serialization:
+
+  The metadata must be a valid JSON object including shape of
+  the contained tensors as an array with key "shape".
+
+  For example: `{ "shape": [2, 5]}`
+
+.. note::
+
+  Elements in an fixed shape tensor extension array are stored
+  in row-major/C-contiguous order.

Review Comment:
Do you have the same problem right now with FixedSizeList arrays? Or what is this converted to in R now? (Can you convert that to a single matrix?)

Also, even if we allow a different order or custom strides for each _individual tensor_, the full array backing the FixedSizeListArray (the flat values child array) still needs its first dimension (with size == length of the logical array) to have the largest stride. So if R doesn't support that, I don't think zero-copy conversion is ever possible?
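
To make the stride argument concrete, here is a minimal sketch in plain Python. It makes no Arrow or R calls: `element_strides` is a hypothetical helper written for this illustration, the array length of 3 is arbitrary, and strides are counted in elements rather than bytes. It reads the shape from the example metadata in the diff, derives `list_size`, and compares the layout implied by the FixedSizeList storage with a single column-major array of the kind R uses natively.

```python
import json
from math import prod


def element_strides(shape, order="C"):
    """Strides (in elements) of a dense array with the given shape.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    strides = [0] * len(shape)
    step = 1
    # C order: the last dimension varies fastest.
    # F (column-major) order: the first dimension varies fastest.
    dims = range(len(shape) - 1, -1, -1) if order == "C" else range(len(shape))
    for d in dims:
        strides[d] = step
        step *= shape[d]
    return tuple(strides)


# Shape taken from the serialized metadata example in the diff.
tensor_shape = tuple(json.loads('{ "shape": [2, 5]}')["shape"])
list_size = prod(tensor_shape)  # product of all elements in the tensor shape
n_tensors = 3                   # arbitrary array length, assumed for the example

# FixedSizeList storage: tensor i occupies flat values
# [i * list_size, (i + 1) * list_size), so the leading (array) dimension
# always has stride list_size, whatever order is used inside each tensor.
row_major_each = (list_size,) + element_strides(tensor_shape, "C")
col_major_each = (list_size,) + element_strides(tensor_shape, "F")

# A single column-major array of shape (n_tensors, 2, 5): here the leading
# dimension has the smallest stride instead.
single_col_major = element_strides((n_tensors,) + tensor_shape, "F")

print(row_major_each)    # (10, 5, 1)
print(col_major_each)    # (10, 1, 2)
print(single_col_major)  # (1, 3, 6)
```

In both FixedSizeList variants the first dimension carries the largest stride, while in the single column-major array it carries the smallest. Under these assumptions, allowing a per-tensor order alone would not make the flat values buffer directly usable as one column-major array.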