randolf-scholz opened a new issue, #48972:
URL: https://github.com/apache/arrow/issues/48972

   ### Describe the enhancement requested
   
   I'd like to cast a `string` array to `float`, but it can contain bad entries.
   
   ```python
   import pyarrow as pa
   import pyarrow.compute as pc
   
   arr = pa.array(["1.2", "3", "10-20", None, "nan", ""])
   
   out = pc.cast(arr, pa.float64(), safe=False)  # currently raises ArrowInvalid
   
   print(out)  # desired output: [1.2, 3, null, null, nan, null]
   ```
   
   My current workaround is to export to `pandas` and use 
[`pandas.to_numeric(errors="coerce")`](https://pandas.pydata.org/docs/reference/api/pandas.to_numeric.html#pandas-to-numeric).
   However, it would be nice if `pyarrow` had some built-in machinery to deal 
with this situation:
   
   1. An `errors={"raise", "coerce"}` option like `pandas.to_numeric` that 
catches conversion errors other than overflow and truncation.
   
   2. A function that returns a boolean mask of all values that are castable.
   
       ```python
       def is_castable(arr, target_type, options=None) -> Array[bool]:
           """Returns boolean mask of values that can be cast to target_type,
           under the chosen options."""
       ```
       
       Such a function would also be useful for extracting the set of all 
values that cannot be cast.
       
   
   
   
   ### Component(s)
   
   C++, Python

