Does creating a decimal128 array, then casting that array to float64, work?
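Roughly like the following (an untested sketch; it assumes pyarrow 12.0.0 can cast a decimal128 array to float64, with the expected loss of precision):

import pyarrow as pa
import pyarrow.compute as pc

big = 12345678901234567890

# decimal128(38, 0) accepts the oversized Python int directly
dec = pa.array([big], type=pa.decimal128(38, 0))

# then cast the whole array to float64 on the Arrow side;
# safe=False in case the lossy decimal -> float cast is rejected as unsafe
flt = pc.cast(dec, pa.float64(), safe=False)
print(flt)  # expect roughly 1.2345678901234567e+19

If that works, it keeps the conversion inside Arrow instead of looping over the values with Python's float().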
On Mon, May 8, 2023 at 3:08 PM Chris Comeau <[email protected]> wrote:

> Is there any way to have pa.compute.cast handle int -> float64 with an
> accepted loss of precision?
>
> The source value is a Python int that is too large for int64, like
> 12345678901234567890, and I'd like to put it into a float64 field in an
> Arrow table.
>
> Using pyarrow 12.0.0:
>
> pa.array([12345678901234567890], type=pa.float64())
> -> ArrowInvalid: PyLong is too large to fit int64
>
> Converting it myself works, with the expected loss of precision:
> pa.array([float(12345678901234567890)], type=pa.float64())
> -> [1.2345678901234567e+19]
>
> but I can't get pa.compute to do the same. Some examples:
>
> pa.compute.cast([20033613169503999008], target_type=pa.float64(), safe=False)
> -> OverflowError: int too big to convert
>
> pa.compute.cast(
>     [12345678901234567890],
>     options=pa.compute.CastOptions.unsafe(target_type=pa.float64())
> )
> -> OverflowError: int too big to convert
>
> I tried the other options, like int overflow and float truncate, with no luck.
>
> Asking Arrow to infer the type hits the same error:
> pa_array = pa.array([12345678901234567890])
> -> OverflowError: int too big to convert
>
> The cast to decimal128(38, 0) works if it's set explicitly:
> pa.array([12345678901234567890], type=pa.decimal128(38, 0))
> -> <pyarrow.lib.Decimal128Array object at 0x000001C45FAA8B80>
> -> [12345678901234567890]
>
> I'm working around it by doing the float() conversion myself, but this is
> slower, of course.
