jorisvandenbossche commented on issue #44881: URL: https://github.com/apache/arrow/issues/44881#issuecomment-2535223337
@emanueledomingo thanks for the report! It's a curious bug, not directly related to what you are doing here, but caused by a cast that happens somewhere inside pandas. That `cast` is itself buggy in pyarrow; here is a reproducer with just pyarrow (using your `pa_table` from above):

```
In [59]: arr = pa_table["key1"]

In [60]: arr2 = arr.cast(arr.type)

In [61]: arr2
Out[61]:
<pyarrow.lib.ChunkedArray object at 0x7f15c63f5c60>
[
<Invalid array: List child array invalid: Invalid: Struct child array #0 has length smaller than expected for struct array (1 < 2)>
]

In [62]: arr2[0]
...
File ~/conda/envs/dev/lib/python3.11/site-packages/pyarrow/scalar.pxi:120, in pyarrow.lib.Scalar.__repr__()
...
ArrowIndexError: index with value of 1 is out-of-bounds for array of length 1
```

Casting this array to its own type results in an invalid array (and then on the pandas side, `to_dict` tries to iterate over the values, which results in that error).
