Albert Shieh created ARROW-2205:

             Summary: Option for integer object nulls
                 Key: ARROW-2205
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++, Python
            Reporter: Albert Shieh

I have a use case where the loss of precision in casting integers to floats 
matters, and pandas supports storing integers with nulls without loss of 
precision in object columns. However, a roundtrip through Arrow casts the 
object columns to float columns, even though the object columns are stored in 
Arrow as integers with nulls.
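
To make the precision concern concrete, here is a minimal sketch in plain Python (no pandas or pyarrow involved). It assumes the float columns are 64-bit floats, which cannot represent every integer above 2**53 exactly. The variable names are only illustrative.

```python
# 2**53 + 1 is the smallest positive integer that a 64-bit float
# cannot represent exactly.
big = 2**53 + 1

# Casting to float is what a conversion to a float column would do.
as_float = float(big)
roundtripped = int(as_float)

print(big)                  # 9007199254740993
print(roundtripped)         # 9007199254740992
print(big == roundtripped)  # False

# An object column keeps the Python int itself, so nothing is lost.
as_object = [None, big]
print(as_object[1] == big)  # True
```

Any identifier or counter that exceeds 2**53 can be silently changed by such a cast, which is why the choice of float versus object matters here.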

This is a minimal example demonstrating the behavior of a roundtrip:
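import numpy as np
import pandas as pd
import pyarrow as pa

# Object column holding a null and an integer
df = pd.DataFrame({"a": np.array([None, 1], dtype=object)})
# Roundtrip: pandas -> Arrow -> pandas
df_pa = pa.Table.from_pandas(df).to_pandas()

print(df)
print(df_pa)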

The output is (original frame first, then the roundtripped frame):
      a
0  None
1     1
     a
0  NaN
1  1.0
This seems to be the desired behavior, given test_int_object_nulls in 

I think it would be useful to add an option in the to_pandas methods to allow 
integers with nulls to be returned as object columns. The option can default to 
false in order to preserve the current behavior.
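
The semantics of such an option could be sketched as follows. This is a plain-Python illustration under stated assumptions, not pyarrow's implementation: the function `to_python_column`, its validity-mask argument, and the flag name `integer_object_nulls` are all hypothetical names chosen for this example.

```python
import math


def to_python_column(values, valid, integer_object_nulls=False):
    """Convert an integer column with a validity mask to Python values.

    values: list of ints (entries at null slots are ignored)
    valid:  list of bools, False where the slot is null
    integer_object_nulls: hypothetical flag mirroring the proposed option
    """
    if all(valid):
        # No nulls: integers can stay integers either way.
        return list(values)
    if integer_object_nulls:
        # Proposed behavior: keep exact ints and use None for nulls.
        return [v if ok else None for v, ok in zip(values, valid)]
    # Current behavior: cast to float and use NaN for nulls.
    return [float(v) if ok else math.nan for v, ok in zip(values, valid)]


values = [0, 2**53 + 1]
valid = [False, True]

default = to_python_column(values, valid)
exact = to_python_column(values, valid, integer_object_nulls=True)

print(default)  # [nan, 9007199254740992.0]
print(exact)    # [None, 9007199254740993]
```

With the flag left at its default of false, the float result matches the current roundtrip behavior. Setting it to true returns the integers unchanged, at the cost of an object-typed column.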

This message was sent by Atlassian JIRA
