[ 
https://issues.apache.org/jira/browse/ARROW-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447731#comment-16447731
 ] 

ASF GitHub Bot commented on ARROW-564:
--------------------------------------

pitrou commented on a change in pull request #1931: ARROW-564 [Python] Add 
support for return zero copy NumPy arrays
URL: https://github.com/apache/arrow/pull/1931#discussion_r183309840
 
 

 ##########
 File path: python/pyarrow/tests/test_array.py
 ##########
 @@ -83,6 +83,33 @@ def test_long_array_format():
     assert result == expected
 
 
+def test_to_numpy_zero_copy():
 
 Review comment:
   This function isn't actually testing the zero-copy part. You should mutate 
the resulting NumPy array and check that the original Arrow array is mutated 
as well (of course, the fact that we're able to get a mutable NumPy array from 
an Arrow array could be seen as a bug).



> [Python] Add methods to return vanilla NumPy arrays (plus boolean mask array 
> if there are nulls)
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-564
>                 URL: https://issues.apache.org/jira/browse/ARROW-564
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: beginner, pull-request-available
>             Fix For: 1.0.0
>
>
> At the moment, {{pyarrow.Array}} instances have a method called 
> {{to_pandas}}. Although this method returns NumPy arrays, it returns them in 
> the form that Pandas uses in its {{Series}}. The difference is visible, for 
> example, for integers with null values: for Pandas, we convert the data into 
> a float array and set the entries that are null in the Arrow array to NaN. 
> For vanilla NumPy arrays, we would instead return a tuple of a validity 
> bytemap (not bitmap!) and a values array. The values array in this case 
> should simply be a view on the underlying Arrow buffer.



