clarkzinzow opened a new issue, #35599:
URL: https://github.com/apache/arrow/issues/35599

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The [fixed-shape tensor extension 
type](https://arrow.apache.org/docs/python/extending_types.html#fixed-size-tensor)
 does not appear to be picklable. Given that pickling Arrow data is supported 
in general and is used in Python-centric systems such as Ray, supporting 
pickling for canonical extension types/arrays seems reasonable.
   
   ## Reproduction
   
   ```python
   pickle.loads(pickle.dumps(pa.fixed_shape_tensor(pa.int64(), (2, 2))))
   ```
   raises the error:
   ```
   KeyError                                  Traceback (most recent call last)
   File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4798, in pyarrow.lib.type_for_alias()
   
   KeyError: 'extension<arrow.fixed_shape_tensor>'
   ```
   
   ```python
   tensor_type = pa.fixed_shape_tensor(pa.int32(), (2, 2))
   arr = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
   storage = pa.array(arr, pa.list_(pa.int32(), 4))
   tensor_array = pa.ExtensionArray.from_storage(tensor_type, storage)
   pickle.loads(pickle.dumps(tensor_array))
   ```
   raises nearly the same error, this time chained with a `ValueError`:
   ```
   KeyError                                  Traceback (most recent call last)
   File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4798, in pyarrow.lib.type_for_alias()
   
   KeyError: 'extension<arrow.fixed_shape_tensor>'
   
   During handling of the above exception, another exception occurred:
   
   ValueError                                Traceback (most recent call last)
   Cell In[13], line 1
   ----> 1 pickle.loads(pickle.dumps(tensor_array))
   
   File .../venv/lib/python3.9/site-packages/pyarrow/types.pxi:4800, in pyarrow.lib.type_for_alias()
   
   ValueError: No type alias for extension<arrow.fixed_shape_tensor>
   ```
   
   ## Environment
   
   - pyarrow 12.0.0
   - Python 3.9
   - macOS
   
   ## Possible Solution
   
   One option might be to implement `__reduce__` on 
[`FixedShapeTensorType`](https://github.com/apache/arrow/blob/2d76d9a526f9827283bb7dfac60715b6ad4aec34/python/pyarrow/types.pxi#L1511C12-L1587)
 so that it reuses the `__arrow_ext_serialize__` / `__arrow_ext_deserialize__` protocol, e.g.:
   ```python
   def __reduce__(self):
       return FixedShapeTensorType.__arrow_ext_deserialize__, (
           self.storage_type, self.__arrow_ext_serialize__())
   ```
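
To illustrate the idea without depending on pyarrow internals, here is a minimal pure-Python sketch of a `__reduce__` that delegates to a serialize/deserialize pair. `ToyTensorType`, its JSON metadata encoding, and the string used as a storage type are all hypothetical stand-ins for illustration, not pyarrow's actual classes or wire format.

```python
import json
import pickle


class ToyTensorType:
    """Hypothetical stand-in for an extension type; NOT pyarrow's class."""

    def __init__(self, storage_type, shape):
        self.storage_type = storage_type  # assumption: a plain string here
        self.shape = tuple(shape)

    def __arrow_ext_serialize__(self):
        # Encode only the extension metadata as bytes (JSON is an assumption).
        return json.dumps({"shape": list(self.shape)}).encode()

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        # Rebuild the type from its storage type plus the metadata bytes.
        metadata = json.loads(serialized.decode())
        return cls(storage_type, metadata["shape"])

    def __reduce__(self):
        # Ask pickle to reconstruct the object through the deserialize hook.
        return (
            type(self).__arrow_ext_deserialize__,
            (self.storage_type, self.__arrow_ext_serialize__()),
        )

    def __eq__(self, other):
        return (
            isinstance(other, ToyTensorType)
            and self.storage_type == other.storage_type
            and self.shape == other.shape
        )


original = ToyTensorType("fixed_size_list<int64>[4]", (2, 2))
restored = pickle.loads(pickle.dumps(original))
print(restored == original)
```

The round trip works because pickle stores the bound classmethod by reference to the class and calls it with the saved arguments on load. Whether the same approach fits the Cython-backed `FixedShapeTensorType` would still need to be checked against the real implementation.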
   
   ### Component(s)
   
   Python

