Alvaro-Kothe opened a new pull request, #49095:
URL: https://github.com/apache/arrow/pull/49095

   ### Rationale for this change
   This PR restores the floating-point parsing behavior from before version 23 
for overflow and subnormal input.
   
   In version `3.10.1`, `fast_float` did not set an error code in these cases: 
it assigned `±Inf` on overflow and `0.0` on subnormal. Since the update to 
version `8.1`, it sets `std::errc::result_out_of_range` in such cases.
   
   ### What changes are included in this PR?
   Ignore `std::errc::result_out_of_range` and produce `±Inf` / `0.0` as 
appropriate instead of failing the conversion.
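
The intended semantics can be sketched in Python. This is only an illustration, not the C++ change in this PR: `strict_parse` and `lenient_parse` are hypothetical names, and Python's built-in `float()` stands in for the parser because it saturates to `±inf` / `0.0` on out-of-range text.

```python
import math
from decimal import Decimal


def strict_parse(text: str) -> tuple[float, bool]:
    """Hypothetical stand-in for a parser that flags out-of-range input."""
    value = float(text)  # Python's float() saturates to ±inf or 0.0
    overflowed = math.isinf(value)
    # Nonzero decimal text that rounds to zero has underflowed.
    underflowed = value == 0.0 and Decimal(text) != 0
    return value, not (overflowed or underflowed)


def lenient_parse(text: str) -> float:
    """Keep the saturated value and ignore the out-of-range flag."""
    value, _in_range = strict_parse(text)
    return value


for text in ("10E617", "-10E617", "10E-617", "1.5"):
    print(text, strict_parse(text)[1], lenient_parse(text))
```

A caller that treats the flag as a failure would reject the first three inputs, while the lenient variant still yields a double for each of them.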
   
   ### Are these changes tested?
   Yes. Added tests for overflow with positive and negative mantissas, and 
tests for subnormal values, each covering binary{16,32,64}.
   
   ### Are there any user-facing changes?
   Yes. With `libarrow==23`, the CSV reader was reading such values as 
strings, whereas earlier versions parsed them as `0` or `±inf`.
   
   With this patch, the CSV reader in PyArrow outputs:
   
   ```python
   >>> import pyarrow
   >>> import pyarrow.csv
   >>> import io
   >>> table = pyarrow.csv.read_csv(io.BytesIO(f"data\n10E-617\n10E617\n-10E617".encode()))
   >>> print(table)
   pyarrow.Table
   data: double
   ----
   data: [[0,inf,-inf]]
   ```
   
   Closes #49003 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.