[ 
https://issues.apache.org/jira/browse/ARROW-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375704#comment-16375704
 ] 

ASF GitHub Bot commented on ARROW-2205:
---------------------------------------

xhochy commented on issue #1650: ARROW-2205: [Python] Option for integer object 
nulls
URL: https://github.com/apache/arrow/pull/1650#issuecomment-368247892
 
 
   @wesm I think using `kwargs` is the most Pythonic way to do this. 
With pandas, I also wondered at first about the large number of kwargs, 
but in the end it seems like a good-enough solution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Python] Option for integer object nulls
> ----------------------------------------
>
>                 Key: ARROW-2205
>                 URL: https://issues.apache.org/jira/browse/ARROW-2205
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>    Affects Versions: 0.8.0
>            Reporter: Albert Shieh
>            Priority: Major
>              Labels: pull-request-available
>
> I have a use case where the loss of precision in casting integers to floats 
> matters, and pandas supports storing integers with nulls without loss of 
> precision in object columns. However, a roundtrip through arrow will cast the 
> object columns to float columns, even though the object columns are stored in 
> arrow as integers with nulls.
> This is a minimal example demonstrating the behavior of a roundtrip:
> {code}
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> # Object column holding one null and one integer
> df = pd.DataFrame({"a": np.array([None, 1], dtype=object)})
> # Round-trip: pandas -> Arrow table -> pandas
> df_pa = pa.Table.from_pandas(df).to_pandas()
> print(df)
> print(df_pa)
> {code}
> The output is:
> {code}
>       a
> 0  None
> 1     1
>      a
> 0  NaN
> 1  1.0
> {code}
> This seems to be the desired behavior, given test_int_object_nulls in 
> test_convert_pandas.
> I think it would be useful to add an option in the to_pandas methods to allow 
> integers with nulls to be returned as object columns. The option can default 
> to false in order to preserve the current behavior.
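
To illustrate why the float cast matters and what a keyword option defaulting to false might look like, here is a minimal pure-Python sketch. It does not use pyarrow; the function `column_to_python` and the keyword name `integer_object_nulls` are hypothetical stand-ins chosen for this example, not the API discussed in the pull request.

```python
import math


def column_to_python(values, *, integer_object_nulls=False):
    """Hypothetical stand-in for converting an integer column with nulls.

    By default, a column containing nulls is cast to floats with NaN for
    nulls (mirroring the behavior shown in the report). With
    integer_object_nulls=True, values are kept as Python objects instead.
    """
    has_nulls = any(v is None for v in values)
    if not has_nulls or integer_object_nulls:
        # Keep exact Python ints; nulls stay as None.
        return list(values)
    # Default path: nulls become NaN, integers become floats.
    return [math.nan if v is None else float(v) for v in values]


# 2**53 + 1 cannot be represented exactly as a 64-bit float.
big = 2**53 + 1

as_float = column_to_python([None, big])
as_object = column_to_python([None, big], integer_object_nulls=True)

print(int(as_float[1]) == big)   # False: precision was lost in the cast
print(as_object[1] == big)       # True: the integer survived unchanged
```

Making the option keyword-only with a default of `False` keeps existing callers on the current behavior, which is the compatibility property the description asks for.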



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
