[ https://issues.apache.org/jira/browse/ARROW-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537289#comment-16537289 ]
Wes McKinney commented on ARROW-2806:
-------------------------------------
I guess it should raise unless {{safe=False}}. What do you think?
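A minimal pure-Python sketch of the proposed behaviour, for illustration only. The helper name cast_floats_to_ints and its safe parameter are hypothetical and are not pyarrow API; using the int64 minimum as the result of an unsafe cast is an assumption based on the INT_MIN mentioned in the quoted report.

```python
import math

INT64_MIN = -2**63  # assumed stand-in for the INT_MIN sentinel in the report


def cast_floats_to_ints(values, safe=True):
    """Hypothetical helper (not pyarrow API): cast floats to ints.

    None marks a real null and is passed through unchanged.
    NaN raises when safe=True; with safe=False it becomes INT64_MIN.
    """
    out = []
    for v in values:
        if v is None:
            out.append(None)  # nulls stay null
        elif math.isnan(v):
            if safe:
                raise ValueError("cannot safely cast NaN to an integer")
            out.append(INT64_MIN)  # unsafe cast: sentinel value, not a null
        else:
            out.append(int(v))
    return out
```

With safe=True, a list containing NaN raises ValueError, while a list containing None keeps its null.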
> [Python] Inconsistent handling of np.nan
> ----------------------------------------
>
> Key: ARROW-2806
> URL: https://issues.apache.org/jira/browse/ARROW-2806
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.9.0
> Reporter: Uwe L. Korn
> Priority: Major
> Fix For: 0.10.0
>
>
> Currently we handle {{np.nan}} differently depending on whether a list or a
> NumPy array is passed to {{pa.array()}}:
> {code}
> >>> pa.array(np.array([1, np.nan]))
> <pyarrow.lib.DoubleArray object at 0x11680bea8>
> [
> 1.0,
> nan
> ]
> >>> pa.array([1., np.nan])
> <pyarrow.lib.DoubleArray object at 0x10bdacbd8>
> [
> 1.0,
> NA
> ]
> {code}
> I would actually think the last one is the correct one, especially once one
> casts this to an integer column: there the first one produces a column with
> INT_MIN and the second one produces a real null.
> But in {{test_array_conversions_no_sentinel_values}} we check that
> {{np.nan}} does not produce a null.
> Even weirder:
> {code}
> >>> df = pd.DataFrame({'a': [1., None]})
> >>> df
>      a
> 0  1.0
> 1  NaN
> >>> pa.Table.from_pandas(df).column(0)
> <Column name='a' type=DataType(double)>
> chunk 0: <pyarrow.lib.DoubleArray object at 0x104bbf958>
> [
> 1.0,
> NA
> ]
> {code}
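The two behaviours in the quoted report can be sketched in plain Python as follows. This is an illustrative model only: the function to_nullable and its nan_as_null flag are hypothetical names, not pyarrow API, and None stands in for an Arrow null.

```python
import math


def to_nullable(values, nan_as_null=False):
    """Hypothetical model of the two conversions: None represents a null.

    nan_as_null=False keeps NaN as an ordinary float value.
    nan_as_null=True treats NaN as a missing value and emits None.
    """
    out = []
    for v in values:
        if v is None:
            out.append(None)
        elif nan_as_null and isinstance(v, float) and math.isnan(v):
            out.append(None)  # NaN interpreted as a sentinel for "missing"
        else:
            out.append(v)
    return out
```

Under this model, the NumPy-array path in the report corresponds to nan_as_null=False, and the list and pandas paths correspond to nan_as_null=True.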