danepitkin commented on issue #34901:
URL: https://github.com/apache/arrow/issues/34901#issuecomment-1499584242

   > @danepitkin thank you for the clarification. In NumPy, however, the cast 
succeeds, and it seems as if the full value is preserved:
   > 
   > >>> np.array([18014398509481984]).astype("float64")
   > array([1.80143985e+16])
   > 
   > Is there an internal difference in how double values are stored between 
Arrow and NumPy that would cause the difference?
   
   In your example, 18,014,398,509,481,984 (which is 2^54) is exactly 
representable as a float64 under the floating-point specification, so it is not 
a good test case. Instead, let's try 18,014,398,509,481,983, which is odd. 
Integers between 2^53 and 2^54 must be multiples of 2 to convert safely.
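
The spacing rule can be checked with Python's built-in `float`, which is an IEEE 754 double on all common platforms. This is only a small sketch; the helper name `survives_float64_round_trip` is made up for illustration and is not part of NumPy or Arrow.

```python
def survives_float64_round_trip(n: int) -> bool:
    """Return True if the integer n converts to float64 and back unchanged."""
    return int(float(n)) == n


# 2**54 is a power of two, so a double stores it exactly.
print(survives_float64_round_trip(18014398509481984))  # True

# 2**54 - 1 is odd, and doubles in [2**53, 2**54) are spaced 2 apart.
print(survives_float64_round_trip(18014398509481983))  # False

# The odd value is rounded to its even neighbour, 2**54.
print(int(float(18014398509481983)))  # 18014398509481984
```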
   
   You will lose data in this NumPy cast. (And yes, judging purely from the 
different behavior, my guess is that the two libraries adhere to the 
floating-point spec slightly differently.)
   ```
   >>> np.array([18014398509481983]).astype("float64").astype("int64") == 18014398509481983
   array([False])
   
   >>> np.array([18014398509481983]).astype("float64").astype("int64")
   array([18014398509481984])
   ```
   
   NumPy defaults to unsafe casting 
(https://numpy.org/doc/stable/reference/generated/numpy.ndarray.astype.html), 
but it also seems not to perform safety checks properly in every case.
   ```
   >>> type(np.array([18014398509481984])[0])
   <class 'numpy.int64'>
   
   
   # Bug? No safety error for int64 -> float64
   >>> np.array([18014398509481983]).astype("float64", casting="safe").astype("int64") == 18014398509481983
   array([False])
   
   
   # Good? Errors out on float64 -> int64, but the bug happened in the
   # int64 -> float64 cast and was somehow propagated.
   >>> np.array([18014398509481983]).astype("float64", casting="safe").astype("int64", casting="safe") == 18014398509481983
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'
   ```
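
One possible reading of the two results above, offered as an assumption rather than something confirmed in this thread: NumPy appears to decide `casting="safe"` from the source and target dtypes alone, without inspecting the stored values. `np.can_cast` reports that rule directly. The helper `changed_by_float64` below is a hypothetical value-based check, not a NumPy API.

```python
import numpy as np

# The "safe" rule is evaluated per dtype pair, not per value.
print(np.can_cast(np.int64, np.float64, casting="safe"))  # True
print(np.can_cast(np.float64, np.int64, casting="safe"))  # False


def changed_by_float64(values: np.ndarray) -> np.ndarray:
    """Mask of integers that do not survive an int -> float64 -> int round trip."""
    # Go through Python ints so the comparison itself cannot overflow or round.
    return np.array([int(float(v)) != v for v in values.tolist()], dtype=bool)


arr = np.array([18014398509481984, 18014398509481983], dtype=np.int64)
print(changed_by_float64(arr).tolist())  # [False, True]
```

A check like this is per element and therefore slow on large arrays; it is meant only to make the lossy entries visible.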

