jorisvandenbossche opened a new issue, #35040: URL: https://github.com/apache/arrow/issues/35040
### Describe the enhancement requested

See https://github.com/apache/arrow/issues/34901 for a longer discussion, but in summary: the `pyarrow.Scalar` object has a `cast()` method that, in contrast with the other cast methods in pyarrow, performs an unsafe cast by default. We should probably change this to do a safe cast by default, and at the same time allow specifying `CastOptions` (so a user can still opt into an unsafe cast).

Example:

```python
# scalar behaviour
>>> pa.scalar(1.5)
<pyarrow.DoubleScalar: 1.5>
>>> pa.scalar(1.5).cast(pa.int64())
<pyarrow.Int64Scalar: 1>

# vs array behaviour
>>> pa.array([1.5]).cast(pa.int64())
...
ArrowInvalid: Float value 1.5 was truncated converting to int64
```

The Python `cast()` method calls the C++ `Scalar::CastTo`:

https://github.com/apache/arrow/blob/e488942cd552ac36a46d40477c1b0326a626ed98/cpp/src/arrow/scalar.h#L99-L100

which currently indeed does not provide an option to pass `CastOptions`. In addition, casting Scalars has a somewhat custom implementation: it does not use the generic `Cast` implementation (from the compute kernels), but a custom `CastImpl` in `scalar.cc`. I am not fully sure of the reason for this, but maybe historically we wanted scalar casting to work without relying on the optional compute module? (cf. https://github.com/apache/arrow/issues/25025)

### Component(s)

C++
