jorisvandenbossche commented on issue #29892:
URL: https://github.com/apache/arrow/issues/29892#issuecomment-1481040252

   To be explicit, there is no "internal" fix to be done, as this conversion is 
already possible zero-copy while preserving the dtype, _if_ you convert the flat 
values (i.e. what Antoine showed above):
   
   ```
   >>> import pyarrow as pa
   >>> a = pa.array([[1,2,3], [4,5,6]])
   >>> a.flatten().to_numpy()
   array([1, 2, 3, 4, 5, 6])
   >>> a.flatten().to_numpy().reshape(2, 3)
   array([[1, 2, 3],
          [4, 5, 6]])
   ```
   
   So it is more a question of what user-facing API we provide for this. 
Do we expect users to do this themselves, or do we want to add some 
"to_numpy_2d" method to FixedSizeListArray that does it for them? 
   The existing `to_numpy` cannot do this, because this method is expected to 
give you a 1D array of the same length as the pyarrow array. I personally would 
lean towards letting the user do this themselves, since this is relatively 
straightforward to do and then you have full control (a method to get a 2D 
array would also get messy if you have a list array with multiple levels of 
nesting). So regarding the original topic, I would tend to close this issue.
   
   But @westonpace makes a good point that the FixedShapeTensorArray extension 
type that is being added might be interesting, depending on your exact use 
case. The pyarrow API for that still needs to be finalized and merged, but we 
were planning to add a `to_numpy_array` method (or some other name) that gives 
you the actual underlying array zero-copy as an N-d array. See the examples in 
the documentation that is being added in 
https://github.com/apache/arrow/pull/33948
   
   

