jorisvandenbossche commented on issue #35717:
URL: https://github.com/apache/arrow/issues/35717#issuecomment-1558715711
> Based on how pyarrow handles numpy nans I'd expect this to get converted
> to a pyarrow null.
We actually preserve NaNs by default when passing them like that:
```
In [16]: pa.array([float("nan")])
Out[16]:
<pyarrow.lib.DoubleArray object at 0x7f41b7c90ee0>
[
  nan
]
```
It's only when you specify `from_pandas=True` that we convert NaN to null.
That argument is set to True automatically when you pass a pandas object
(currently a Series or Index; we should expand this check to cover an array as well),
see https://arrow.apache.org/docs/python/generated/pyarrow.array.html
That aside, the error you get here is a bit confusing, and that seems to
come from a bug in the precision/scale inference. If the first value is not a
NaN, we see a better error message:
```
In [17]: pa.array([Decimal("1.20"), Decimal("nan")])
...
ArrowInvalid: The string 'NaN' is not a valid decimal128 number
```
So we generally don't support NaN (or +/- Inf) for decimal data. Given that
we don't support it, we should maybe consider converting those values to nulls
instead (or at least give the option to do so). Casting float to decimal
also raises an error for NaN/Inf values:
```
In [32]: pa.array([1.2, 0.0]).cast(pa.decimal128(3, 2))
Out[32]:
<pyarrow.lib.Decimal128Array object at 0x7f41b7fd8c40>
[
  1.20,
  0.00
]
In [33]: pa.array([1.2, 0.0, np.nan]).cast(pa.decimal128(3, 2))
...
ArrowInvalid: Cannot convert nan to Decimal128
```
For the `array(..)` constructor, the `from_pandas=True` argument
already ensures that a decimal NaN gets converted to null:
```
In [34]: pa.array([Decimal("1.20"), Decimal("nan")], from_pandas=True)
Out[34]:
<pyarrow.lib.Decimal128Array object at 0x7f41c415cac0>
[
  1.20,
  null
]
```
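
If you want to handle +/- Inf in the same way, or would rather not rely on `from_pandas=True`, one workaround is to replace the non-finite values with `None` yourself before building the array. This is only a rough sketch: `non_finite_to_none` is a hypothetical helper, not something pyarrow provides, and I have not checked here whether `from_pandas=True` also covers infinities.

```python
from decimal import Decimal
import math


def non_finite_to_none(values):
    """Return a new list with NaN / +-Infinity entries replaced by None.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    cleaned = []
    for value in values:
        if isinstance(value, Decimal):
            # Decimal.is_finite() is False for NaN, sNaN and +/-Infinity.
            cleaned.append(value if value.is_finite() else None)
        elif isinstance(value, float):
            cleaned.append(value if math.isfinite(value) else None)
        else:
            # Leave everything else (including None) untouched.
            cleaned.append(value)
    return cleaned


cleaned = non_finite_to_none(
    [Decimal("1.20"), Decimal("nan"), Decimal("-Infinity")]
)
print(cleaned)  # [Decimal('1.20'), None, None]
```

The cleaned list can then be passed to `pa.array(cleaned)`, where the `None` entries become nulls.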