clarkzinzow opened a new issue, #35599: URL: https://github.com/apache/arrow/issues/35599
### Describe the bug, including details regarding any error messages, version, and platform.

The [fixed-shape tensor extension type](https://arrow.apache.org/docs/python/extending_types.html#fixed-size-tensor) does not appear to be picklable. Given that pickling Arrow data is supported in general and is used in Python-centric systems such as Ray, supporting pickling for canonical extension types/arrays seems reasonable.

## Reproduction

```python
import pickle
import pyarrow as pa

pickle.loads(pickle.dumps(pa.fixed_shape_tensor(pa.int64(), (2, 2))))
```

raises the error:

```
KeyError Traceback (most recent call last)
File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4798, in pyarrow.lib.type_for_alias()

KeyError: 'extension<arrow.fixed_shape_tensor>'
```

Pickling an array of this type:

```python
tensor_type = pa.fixed_shape_tensor(pa.int32(), (2, 2))
arr = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
storage = pa.array(arr, pa.list_(pa.int32(), 4))
tensor_array = pa.ExtensionArray.from_storage(tensor_type, storage)
pickle.loads(pickle.dumps(tensor_array))
```

raises essentially the same error:

```
KeyError Traceback (most recent call last)
File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4798, in pyarrow.lib.type_for_alias()

KeyError: 'extension<arrow.fixed_shape_tensor>'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
Cell In[13], line 1
----> 1 pickle.loads(pickle.dumps(tensor_array))

File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4800, in pyarrow.lib.type_for_alias()

ValueError: No type alias for extension<arrow.fixed_shape_tensor>
```

## Environment

- pyarrow 12.0.0
- Python 3.9
- macOS

## Possible Solution

It seems like we might be able to implement `__reduce__` on [`FixedShapeTensorType`](https://github.com/apache/arrow/blob/2d76d9a526f9827283bb7dfac60715b6ad4aec34/python/pyarrow/types.pxi#L1511C12-L1587) such that it uses the `__arrow_ext_serialize__` serialization protocol? E.g.

```python
def __reduce__(self):
    # Rebuild the type from its storage type plus the serialized metadata.
    return FixedShapeTensorType.__arrow_ext_deserialize__, (self.storage_type, self.__arrow_ext_serialize__())
```

### Component(s)

Python
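To illustrate the `__reduce__` idea from the Possible Solution without depending on pyarrow, here is a minimal pure-Python sketch. `ToyTensorType` is a hypothetical stand-in, not a pyarrow class, and its JSON metadata encoding is an assumption chosen for the example. It only mimics the shape of the `__arrow_ext_serialize__` / `__arrow_ext_deserialize__` hooks to show how pickle could round-trip a type through them.

```python
import json
import pickle


class ToyTensorType:
    """Hypothetical stand-in for an extension type (not part of pyarrow)."""

    def __init__(self, storage_type, shape):
        self.storage_type = storage_type  # a plain string here, e.g. "int64"
        self.shape = tuple(shape)

    def __arrow_ext_serialize__(self):
        # Encode only the type parameters as bytes (JSON is an assumption).
        return json.dumps({"shape": list(self.shape)}).encode()

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        # Rebuild the type from the storage type plus the serialized metadata.
        return cls(storage_type, json.loads(serialized.decode())["shape"])

    def __reduce__(self):
        # Tell pickle to reconstruct the object through the deserialize hook.
        return (
            type(self).__arrow_ext_deserialize__,
            (self.storage_type, self.__arrow_ext_serialize__()),
        )

    def __eq__(self, other):
        return (
            isinstance(other, ToyTensorType)
            and self.storage_type == other.storage_type
            and self.shape == other.shape
        )


original = ToyTensorType("int64", (2, 2))
restored = pickle.loads(pickle.dumps(original))
assert restored == original
```

Whether the real `FixedShapeTensorType` can follow the same pattern depends on how its deserialize hook is exposed, so treat this as a sketch of the mechanism rather than a proposed patch.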
