[jira] [Created] (ARROW-10512) [Python] Arrow to Pandas conversion promotes child array to float for NULL values

Bryan Cutler (Jira) Fri, 06 Nov 2020 15:57:16 -0800

Bryan Cutler created ARROW-10512:
------------------------------------

             Summary: [Python] Arrow to Pandas conversion promotes child array 
to float for NULL values
                 Key: ARROW-10512
                 URL: https://issues.apache.org/jira/browse/ARROW-10512
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Bryan Cutler



When converting a nested Arrow array to Pandas, a child array of integer type 
that contains NULL values is promoted to floating point, and the NULL values 
are replaced with NaN. Because the Pandas conversion for nested types produces 
Python objects, NaN is not needed as a missing-value marker and `None` could 
be inserted instead. This affects ListType, MapType, StructType, and other 
nested types.

{code}
In [1]: import pandas as pd

In [2]: import pyarrow as pa

In [4]: s = pd.Series([[1, 2, 3], [4, 5, None]])

In [5]: arr = pa.Array.from_pandas(s)

In [6]: arr.type
Out[6]: ListType(list<item: int64>)

In [7]: arr.to_pandas()
Out[7]: 
0    [1.0, 2.0, 3.0]
1    [4.0, 5.0, nan]
dtype: object
{code}
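
To make the proposal concrete, here is a minimal pure-Python sketch of the 
intended result. It is not pyarrow's conversion code: the helper `list_rows` 
and the `offsets` / `values` / `validity` inputs are hypothetical stand-ins 
for a list array's flattened layout, and top-level (row) nulls are ignored 
for brevity.

```python
def list_rows(offsets, values, validity):
    """Rebuild per-row Python lists from a flattened list layout.

    Row i spans values[offsets[i]:offsets[i + 1]]. validity[j] is False
    where child slot j is NULL. NULL slots become None, so the remaining
    integers are never promoted to float.
    """
    rows = []
    for start, stop in zip(offsets, offsets[1:]):
        rows.append(
            [values[j] if validity[j] else None for j in range(start, stop)]
        )
    return rows


# Hypothetical layout for [[1, 2, 3], [4, 5, NULL]]
offsets = [0, 3, 6]
values = [1, 2, 3, 4, 5, 0]  # the value stored in a NULL slot is arbitrary
validity = [True, True, True, True, True, False]

rows = list_rows(offsets, values, validity)
print(rows)  # [[1, 2, 3], [4, 5, None]]
```

For comparison, `arr.to_pylist()` on the array above already returns `None` 
for NULL slots, so wrapping that result in an object-dtype Series may serve 
as a workaround, at the cost of going through Python objects.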



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
