Is there any way to have pa.compute.cast handle an int -> float64 cast with an accepted loss of precision?
The source value is a Python int that is too large for int64, e.g. 12345678901234567890, and I'd like to put it into a float64 field in an Arrow table.
Using pyarrow 12.0.0:
pa.array([12345678901234567890], type=pa.float64())
-> ArrowInvalid: PyLong is too large to fit int64
Converting it myself works, with the expected loss of precision:
pa.array([float(12345678901234567890)], type=pa.float64())
-> [1.2345678901234567e+19]
but I can't get pa.compute to do the same. Some examples:
pa.compute.cast([20033613169503999008], target_type=pa.float64(), safe=False)
-> OverflowError: int too big to convert

pa.compute.cast(
    [12345678901234567890],
    options=pa.compute.CastOptions.unsafe(target_type=pa.float64()),
)
-> OverflowError: int too big to convert
I also tried the other CastOptions flags, such as allow_int_overflow and allow_float_truncate (roughly as in the sketch below), with no luck.
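Roughly what I tried (I may be holding the flags wrong):

import pyarrow as pa
import pyarrow.compute as pc

pc.cast(
    [12345678901234567890],
    options=pc.CastOptions(
        target_type=pa.float64(),
        allow_int_overflow=True,
        allow_float_truncate=True,
    ),
)
-> same OverflowError: int too big to convert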
Asking Arrow to infer types hits the same error:
pa_array = pa.array([12345678901234567890])
-> OverflowError: int too big to convert
Casting to decimal128(38, 0) works if I set the type explicitly:
pa.array([12345678901234567890], type=pa.decimal128(38, 0))
-> <pyarrow.lib.Decimal128Array object at 0x000001C45FAA8B80>
   [12345678901234567890]
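That makes me wonder whether a two-step route, Python int -> decimal128 -> float64, would dodge the Python-level OverflowError. I haven't verified that the decimal128 -> float64 cast is supported in 12.0.0, or that it rounds the same way float() does; this is just a sketch:

import pyarrow as pa
import pyarrow.compute as pc

# build the array as decimal128 first (this part works, per above),
# then ask compute to cast it down to float64
decimals = pa.array([12345678901234567890], type=pa.decimal128(38, 0))
pc.cast(decimals, target_type=pa.float64(), safe=False)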
I'm working around it by doing the float() conversion myself (roughly as below), but that extra Python-level pass is of course slower.
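For reference, the current workaround looks like this:

import pyarrow as pa

raw = [12345678901234567890]  # Python ints that overflow int64
pa.array([float(v) for v in raw], type=pa.float64())
-> [1.2345678901234567e+19]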