0x26res opened a new issue, #40750:
URL: https://github.com/apache/arrow/issues/40750
### Describe the bug, including details regarding any error messages, version, and platform.
I'm trying to call `pa.MapArray.from_arrays`, in a simple example where I
create a MapArray of empty maps:
```
import pyarrow as pa

# Six equal offsets describe five empty maps.
offsets = pa.array([0, 0, 0, 0, 0, 0], pa.int32())
map_array = pa.MapArray.from_arrays(
    offsets,
    pa.array([], pa.string()),  # keys
    pa.array([], pa.string()),  # items
)
assert len(map_array) == 5
assert map_array.to_pylist() == [[]] * 5
```
This works fine, but if I try the same thing with a slice of `offsets`
(a view with a non-zero offset), it fails:
```
import pytest

with pytest.raises(
    pa.ArrowInvalid,
    match=(
        r"List child array invalid: Invalid: Struct child array #0 has "
        r"length smaller than expected for struct array \(0 < 1\)"
    ),
):
    pa.MapArray.from_arrays(
        offsets[1:],
        pa.array([], pa.string()),
        pa.array([], pa.string()),
    )
```
For now, as a workaround, I rebuild the offsets array from the slice's buffers:
```
offset_shift = offsets[1:]
# from_buffers defaults to offset=0, so the result no longer carries
# the slice's offset.
offset_shift_copy = pa.Array.from_buffers(
    offset_shift.type, len(offset_shift), offset_shift.buffers()
)
map_array = pa.MapArray.from_arrays(
    offset_shift_copy,
    pa.array([], pa.string()),
    pa.array([], pa.string()),
)
assert len(map_array) == 4
assert map_array.to_pylist() == [[]] * 4
```
My intuition is that the `offset` isn't interpreted correctly in the
underlying code, but I can't really tell where.
For context, this happens when I try to do my own cast of a chunked
MapArray: I iterate through the chunks and slightly modify each chunk's
underlying values array, and that is where the error appears.
Tested with pyarrow==15.0.2 / Python 3.11.8
### Component(s)
Python