jorisvandenbossche opened a new issue, #35040:
URL: https://github.com/apache/arrow/issues/35040

   ### Describe the enhancement requested
   
   See https://github.com/apache/arrow/issues/34901 for a longer discussion. 
In summary: the `pyarrow.Scalar` object has a `cast()` method, but unlike 
other cast methods in pyarrow it performs an unsafe cast by default. 
We should probably change this to a safe cast by default and, at the same 
time, allow passing CastOptions (so a user can still opt into an 
unsafe cast). 
   
   Example:
   
   ```python
   # scalar behaviour
   >>> pa.scalar(1.5)
   <pyarrow.DoubleScalar: 1.5>
   >>> pa.scalar(1.5).cast(pa.int64())
   <pyarrow.Int64Scalar: 1>
   
   # vs array behaviour
   >>> pa.array([1.5]).cast(pa.int64())
   ...
   ArrowInvalid: Float value 1.5 was truncated converting to int64
   ```
   
   The Python `cast()` method calls the C++ `Scalar::CastTo`:
   
   
https://github.com/apache/arrow/blob/e488942cd552ac36a46d40477c1b0326a626ed98/cpp/src/arrow/scalar.h#L99-L100
   
   which currently does not accept CastOptions.
   
   In addition, casting Scalars appears to use a somewhat custom 
implementation: instead of the generic Cast implementation from the 
compute kernels, it uses a custom `CastImpl` in scalar.cc. I'm not fully sure 
of the reason for this, but maybe historically we wanted scalar 
casting to work without relying on the optional compute module? (cf. 
https://github.com/apache/arrow/issues/25025)
   
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
