Brandon B. Miller created ARROW-9772:
----------------------------------------

             Summary: Optionally allow for to_pandas to return writeable pandas 
objects
                 Key: ARROW-9772
                 URL: https://issues.apache.org/jira/browse/ARROW-9772
             Project: Apache Arrow
          Issue Type: New Feature
          Components: Python
    Affects Versions: 0.17.1
            Reporter: Brandon B. Miller


In cuDF, I'd like to leverage pyarrow to facilitate the conversion from cuDF 
Series and DataFrame objects into the equivalent pandas objects. Concretely, 
I'd like something like this to work:

 

`pandas_object = cudf_object.to_arrow().to_pandas()`

 

This allows us to stay consistent with the way the rest of the pyarrow 
ecosystem handles nulls, dtype conversions and the like without having to 
reinvent the wheel. However, I noticed that in some zero-copy scenarios, 
pyarrow doesn't seem to fully release the underlying buffers when converting 
with `to_pandas()`. The resulting pandas objects are backed by read-only 
memory, and anyone who tries to mutate the data in place will encounter:

 

`ValueError: assignment destination is read-only`

 

This creates a slightly strange situation where a user can hit an error whose 
only cause is that arrow was used to construct the offending pandas object, 
which is not obvious from the pandas object itself. It would be nice to be 
able to toggle this behavior using a kwarg or something similar. I suspect 
this could come up in other situations where libraries convert back and forth 
between equivalent Python objects through arrow and expect the final object to 
behave as if it had been constructed via other means. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
