[GitHub] [arrow] jorisvandenbossche commented on issue #35622: [Python] Fixed size lists of numeric types without nulls could be converted to numpy with zero-copy

via GitHub Wed, 24 May 2023 01:14:08 -0700


jorisvandenbossche commented on issue #35622:
URL: https://github.com/apache/arrow/issues/35622#issuecomment-1560654180


   The problem is that `to_numpy()` for a fixed size list array doesn't give 
you this flat (or nd) array of the values, but an object dtype array of 
sub-arrays:
   
   ```python
   >>> data.to_numpy(zero_copy_only=False)
   array([array([1, 2]), array([3, 4]), array([5, 6])], dtype=object)
   ```
   
   So while it is true that, for a numeric type without missing values, the 
underlying values are converted zero-copy, and the sub-arrays in the 
array above are zero-copy slices of that converted array, the actual 
object-dtype numpy array returned in the snippet above is still a newly 
allocated array. 
   So it is a bit ambiguous what "zero copy" would mean exactly here.
   
   > But if I work with buffers directly, I can easily get it to work:
   
   Side note: there is actually another API to get this numpy array directly, 
without having to go through the buffers manually:
   
   ```python
   # the underlying values as a pyarrow array (beware that this could
   # contain "garbage" values in case of nulls)
   >>> data.values
   <pyarrow.lib.Int64Array object at 0x7f41b7682aa0>
   [
     1,
     2,
     3,
     4,
     5,
     6
   ]
   # in case of no missing values, this can be converted zero-copy to numpy
   >>> data.values.to_numpy()
   array([1, 2, 3, 4, 5, 6])
   ```
   

