[ 
https://issues.apache.org/jira/browse/ARROW-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447730#comment-16447730
 ] 

ASF GitHub Bot commented on ARROW-564:
--------------------------------------

pitrou commented on a change in pull request #1931: ARROW-564 [Python] Add 
support for return zero copy NumPy arrays
URL: https://github.com/apache/arrow/pull/1931#discussion_r183308153
 
 

 ##########
 File path: python/pyarrow/tests/test_array.py
 ##########
 @@ -83,6 +83,33 @@ def test_long_array_format():
     assert result == expected
 
 
+def test_to_numpy_zero_copy():
+    import gc
+
+    arr = pa.array(range(10))
+
+    for i in range(10):
+        np_arr = arr.to_numpy()
+        assert sys.getrefcount(np_arr) == 2
+        np_arr = None  # noqa
+
+    assert sys.getrefcount(arr) == 2
+
+    for i in range(10):
+        arr = pa.array(range(10))
+        np_arr = arr.to_numpy()
+        arr = None
+        gc.collect()
+
+        # Ensure base is still valid
+
+        # Because of py.test's assert inspection magic, if you put getrefcount
+        # on the line being examined, it will be 1 higher than you expect
+        base_refcount = sys.getrefcount(np_arr.base)
+        assert base_refcount == 2
+        np_arr.sum()
 
 Review comment:
   You should check the result value.
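
   As a rough illustration of what checking the result could look like, here is a
   minimal standard-library sketch. It is an analogy, not the pyarrow test itself:
   an `array.array` stands in for the Arrow buffer, a `memoryview` stands in for
   the zero-copy NumPy array, and the helper name `view_sum_after_owner_dropped`
   is hypothetical. The point is only that the summed value is compared against
   the expected number rather than being computed and discarded.

```python
import gc
from array import array


def view_sum_after_owner_dropped(n=10):
    """Build a buffer, take a zero-copy view, drop the owner, then sum the view."""
    owner = array("q", range(n))  # stand-in for the buffer behind the Arrow array
    view = memoryview(owner)      # stand-in for the zero-copy NumPy array
    owner = None                  # drop the only named reference to the buffer
    gc.collect()
    # The view keeps its exporter alive (view.obj), much as np_arr.base would
    return sum(view)


# Check the value itself, not just that the call does not crash
assert view_sum_after_owner_dropped() == sum(range(10))
```

   In the test under review, the equivalent change would presumably be to assert
   on the value of `np_arr.sum()` for `range(10)` instead of only calling it.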



> [Python] Add methods to return vanilla NumPy arrays (plus boolean mask array 
> if there are nulls)
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-564
>                 URL: https://issues.apache.org/jira/browse/ARROW-564
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: beginner, pull-request-available
>             Fix For: 1.0.0
>
>
> At the moment, {{pyarrow.Array}} instances have a method called 
> {{to_pandas}}. Although this method returns NumPy arrays, it returns them in 
> the form that Pandas would use in a {{Series}}. The difference is visible, for 
> example, with integers that have null values. For Pandas, we convert the data 
> to a float array and set every entry that is null in the Arrow array to NaN. 
> For vanilla NumPy arrays, we would instead return a tuple of a validity 
> bytemap (not bitmap!) and a values array. The values array in this 
> case should simply be a view on the underlying Arrow buffer.
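
To make the bytemap/bitmap distinction in the description concrete, here is a
small pure-Python sketch. It assumes the Arrow convention of one validity bit per
value, least-significant bit first; the function names `bitmap_to_bytemap` and
`to_float_with_nan` are hypothetical and are not part of pyarrow. The first
function produces the one-byte-per-value mask that the proposed method would
return alongside the values; the second mimics the Pandas-style float conversion
for comparison.

```python
import math


def bitmap_to_bytemap(bitmap: bytes, length: int) -> bytes:
    """Expand a validity bitmap (1 bit per value, least-significant bit first)
    into a bytemap (1 byte per value: 1 = valid, 0 = null)."""
    return bytes((bitmap[i // 8] >> (i % 8)) & 1 for i in range(length))


def to_float_with_nan(values, bytemap):
    """Pandas-style result: floats, with NaN in place of each null."""
    return [float(v) if ok else math.nan for v, ok in zip(values, bytemap)]


values = [1, 2, 0, 4]         # the slot under a null holds an arbitrary value
bitmap = bytes([0b00001011])  # bits 0, 1 and 3 set, so index 2 is null
bytemap = bitmap_to_bytemap(bitmap, len(values))
print(list(bytemap))                       # [1, 1, 0, 1]
print(to_float_with_nan(values, bytemap))  # [1.0, 2.0, nan, 4.0]
```

With the tuple form, the integer values keep their dtype and can stay a view on
the original buffer; only the mask has to be materialised.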



