AlenkaF commented on code in PR #34883:
URL: https://github.com/apache/arrow/pull/34883#discussion_r1160513413
##########
python/pyarrow/array.pxi:
##########
@@ -3076,6 +3076,111 @@ cdef class ExtensionArray(Array):
return Array._to_pandas(self.storage, options, **kwargs)
+class FixedShapeTensorArray(ExtensionArray):
+ """
+ Concrete class for fixed shape tensor extension arrays.
+
+ Examples
+ --------
+ Define the extension type for tensor array
+
+ >>> import pyarrow as pa
+ >>> tensor_type = pa.fixed_shape_tensor(pa.int32(), [2, 2])
+
+ Create an extension array
+
+ >>> arr = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
+ >>> storage = pa.array(arr, pa.list_(pa.int32(), 4))
+ >>> pa.ExtensionArray.from_storage(tensor_type, storage)
+ <pyarrow.lib.FixedShapeTensorArray object at ...>
+ [
+ [
+ 1,
+ 2,
+ 3,
+ 4
+ ],
+ [
+ 10,
+ 20,
+ 30,
+ 40
+ ],
+ [
+ 100,
+ 200,
+ 300,
+ 400
+ ]
+ ]
+ """
+
+ def to_numpy_ndarray(self):
+ """
+ Convert fixed shape tensor extension array to a numpy array (with dim+1).
+ """
+ np_flat = np.asarray(self.storage.values)
+ numpy_tensor = np_flat.reshape((len(self),) + tuple(self.type.shape),
+ order='C')
Review Comment:
> No, the `self.type.permutation` is only that of the individual tensor
> elements (given that you have a FixedShapeTensorArray, by definition the first
> dimension of the n+1 dim ndarray is always the length of the array, and can't
> be permuted)
Yes, thank you for confirming what I was struggling with!
Then I don't see a need to raise an error, since it would be raised for
every permutation that is not `None`. Instead, we could state explicitly in the
docstring that dimension 0 is fixed and cannot be permuted
(see the examples and description I added in the docs PR for this binding:
https://github.com/apache/arrow/pull/34957/files)
> Yes, and as mentioned before, I think we should certainly add examples how
> the user can do this (if we decide to not automatically do it in
> `to_numpy_ndarray`)
Done: https://github.com/apache/arrow/pull/34957/files
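For readers of this thread, here is a minimal sketch of what such a user-side step might look like. It uses NumPy only and is not the code from the docs PR. `apply_element_permutation` is a hypothetical helper, and the sketch assumes that `permutation[i]` names the physical dimension that becomes logical dimension `i` of each tensor element. Axis 0 (the array length) stays in place, so every element axis is shifted by one.

```python
import numpy as np


def apply_element_permutation(batch, permutation):
    """Hypothetical helper, not part of pyarrow.

    `batch` has shape (n, *element_shape). Axis 0 indexes the array
    elements and is never permuted. `permutation` refers to the element
    dimensions only, so each index is shifted by one before transposing.
    Assumption: permutation[i] is the physical dimension that becomes
    logical dimension i.
    """
    if permutation is None:
        return batch
    axes = (0,) + tuple(p + 1 for p in permutation)
    return np.transpose(batch, axes)


# Three 2x3 tensors laid out in row-major (C) order.
batch = np.arange(18).reshape(3, 2, 3)
permuted = apply_element_permutation(batch, [1, 0])
print(permuted.shape)  # (3, 3, 2)
```

With a permutation of `[1, 0]`, each element is simply transposed while the leading dimension keeps its length of 3.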
> I wouldn't raise a warning for that, since it's a fact of life that numpy
> arrays don't have dimension names, so that's just a consequence of calling this
> method. We can mention that in the docstring, though, to be explicit.
Agreed, I will make it explicit in the docstrings.