jorisvandenbossche commented on issue #35622:
URL: https://github.com/apache/arrow/issues/35622#issuecomment-1560654180
The problem is that `to_numpy()` for a fixed size list array doesn't give
you this flat (or nd) array of the values, but an object dtype array of
sub-arrays:
```python
>>> data.to_numpy(zero_copy_only=False)
array([array([1, 2]), array([3, 4]), array([5, 6])], dtype=object)
```
It is true that, for a numeric type without missing values, the underlying
values are converted zero-copy, and the sub-arrays in the array above are
zero-copy slices of that converted array. However, the object-dtype numpy
array that is returned in the snippet above is itself still a newly
allocated array.
So it is a bit ambiguous here what "zero copy" would mean exactly.
> But if I work with buffers directly, I can easily get it to work:
Sidenote, there is actually another API to directly get this numpy array,
without having to go through the buffers manually:
```python
# the underlying values as a pyarrow array (beware that this could
# contain "garbage" values in case of nulls)
>>> data.values
<pyarrow.lib.Int64Array object at 0x7f41b7682aa0>
[
  1,
  2,
  3,
  4,
  5,
  6
]
# in case of no missing values, this can be converted zero-copy to numpy
>>> data.values.to_numpy()
array([1, 2, 3, 4, 5, 6])
```