spenczar commented on issue #37004:
URL: https://github.com/apache/arrow/issues/37004#issuecomment-1682953396
Thanks for taking a look @AlenkaF. Here is a reproducer for FixedShapeTensor;
it only happens if you go through the struct. The issue is mis-titled. It's
actually a bug with Tables that contain a Struct field whose child is an
ExtensionType with list-typed storage (whew, that's a lot!):
```py
In [38]: import pyarrow as pa
In [39]: tensor_type = pa.fixed_shape_tensor(pa.int32(), (2, 2))
In [40]: arr = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
In [41]: storage = pa.array(arr, pa.list_(pa.int32(), 4))
In [42]: tensor_array = pa.ExtensionArray.from_storage(tensor_type, storage)
...:
In [43]: pa.StructArray.from_arrays([tensor_array], "x")
Out[43]:
<pyarrow.lib.StructArray object at 0x154e5a680>
-- is_valid: all not null
-- child 0 type: extension<arrow.fixed_shape_tensor>
[
[
1,
2,
3,
4
],
[
10,
20,
30,
40
],
[
100,
200,
300,
400
]
]
In [44]: struct_array = pa.StructArray.from_arrays([tensor_array], "x")
In [45]: table = pa.Table.from_arrays([struct_array], ["field1"])
In [46]: table.cast(table.schema)
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
Cell In[46], line 1
----> 1 table.cast(table.schema)
File ~/.pyenv/versions/3.10.12/envs/adam-core/lib/python3.10/site-packages/pyarrow/table.pxi:3616, in pyarrow.lib.Table.cast()
File ~/.pyenv/versions/3.10.12/envs/adam-core/lib/python3.10/site-packages/pyarrow/table.pxi:3798, in pyarrow.lib.Table.from_arrays()
File ~/.pyenv/versions/3.10.12/envs/adam-core/lib/python3.10/site-packages/pyarrow/table.pxi:2962, in pyarrow.lib.Table.validate()
File ~/.pyenv/versions/3.10.12/envs/adam-core/lib/python3.10/site-packages/pyarrow/error.pxi:100, in pyarrow.lib.check_status()
ArrowInvalid: Column 0: In chunk 0: Invalid: Struct child array #0 invalid: Invalid: Expected 1 child arrays in array of type fixed_size_list<item: int32>[4], got 0
```