[ 
https://issues.apache.org/jira/browse/ARROW-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-564:
------------------------------
    Description: At the moment, {{pyarrow.Array}} instances have a method
called {{to_pandas}}. Although this method returns NumPy arrays, it returns
them in the form that pandas would use in a {{Series}}. The difference is
visible, for example, for integers with null values. For pandas, we convert
the data into a float array and set every entry to NaN where the Arrow array
has a null. For vanilla NumPy arrays, we would instead return a tuple of a
validity bytemap (one byte per element, not a bitmap!) and a values array. In
this case, the values array should simply be a view on the underlying Arrow
buffer.
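
To make the two return forms concrete, here is a minimal sketch. It uses only NumPy and does not call pyarrow: the two byte strings are hand-built stand-ins for the values buffer and validity bitmap of an int64 Arrow array [1, null, 3], and the helper name {{values_and_bytemap}} is hypothetical, not an existing or proposed API. It assumes Arrow's least-significant-bit-first bitmap layout.

```python
import numpy as np


def values_and_bytemap(values_buf, validity_bitmap, length, dtype):
    """Hypothetical helper: return (validity bytemap, values view)."""
    # View the values buffer directly; np.frombuffer does not copy.
    values = np.frombuffer(values_buf, dtype=dtype, count=length)
    if validity_bitmap is None:
        # No bitmap means there are no nulls.
        valid = np.ones(length, dtype=bool)
    else:
        bits = np.frombuffer(validity_bitmap, dtype=np.uint8)
        # Expand the bitmap (1 bit per element, LSB first) into a
        # bytemap (1 byte per element). This step does allocate.
        valid = np.unpackbits(bits, bitorder="little")[:length].astype(bool)
    return valid, values


# Stand-ins for the buffers of an int64 array [1, null, 3].
values_buf = np.array([1, 0, 3], dtype=np.int64).tobytes()
validity_bitmap = bytes([0b00000101])  # elements 0 and 2 are valid

# "Vanilla NumPy" form: integer dtype is preserved, nulls live in the mask.
valid, values = values_and_bytemap(values_buf, validity_bitmap, 3, np.int64)

# pandas-style form: nulls become NaN, which forces a cast to float64.
pandas_style = np.where(valid, values, np.nan)
```

The entry stored in a null slot of {{values}} is unspecified and should only be read together with the mask. The pandas-style result loses the integer dtype, which is the difference the description points out.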

> [Python] Add methods to return vanilla NumPy arrays (plus boolean mask array 
> if there are nulls)
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-564
>                 URL: https://issues.apache.org/jira/browse/ARROW-564
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: beginner
>             Fix For: 1.0.0



