jorisvandenbossche commented on issue #26199:
URL: https://github.com/apache/arrow/issues/26199#issuecomment-1481065636

   I am not sure we can do anything about this, since it is an inherent 
limitation of NumPy: it has no missing value support for integers, so we have 
to use float64 to represent integer data that contains nulls. The same happens 
for primitive, non-nested arrays as well:
   
   ```python
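   import pyarrow as pa

   # 9007199254740993 is 2**53 + 1, which float64 cannot represent exactly
   data = [None, 9007199254740993]
   arr = pa.array(data, type=pa.uint64())
   # the null forces a copy into a float64 ndarray, with NaN for the null
   ndarray = arr.to_numpy(zero_copy_only=False)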
   
   >>> arr
   <pyarrow.lib.UInt64Array object at 0x7fa6efe29400>
   [
     null,
     9007199254740993
   ]
   >>> ndarray
   array([           nan, 9.00719925e+15])
   ```
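   The precision loss itself can be reproduced without pyarrow or NumPy. The 
following is a minimal sketch in plain Python, relying only on the fact that a 
Python `float` is an IEEE 754 float64; the variable names are illustrative and 
not part of any API:

```python
# 9007199254740993 == 2**53 + 1, the smallest positive integer that
# float64 cannot represent exactly.
value = 9007199254740993

# Python floats are IEEE 754 float64, the same type NumPy falls back to here.
as_float = float(value)

# Converting back shows that the nearest representable value was stored.
round_tripped = int(as_float)

print(value == 2**53 + 1)      # True
print(round_tripped)           # 9007199254740992
print(round_tripped == value)  # False
```
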
   
   One difference, though, is that when trying to recreate the primitive array 
from the NumPy result, we raise an error instead of silently roundtripping a 
different value (as happens in the nested case):
   
   ```python
   >>> restored = pa.array(ndarray, type=arr.type)
   ...
   ArrowInvalid: Float value nan was truncated converting to uint64
   ```
   
   That's related to the fact that we don't do safe casting when converting 
nested data, which is discussed in 
https://github.com/apache/arrow/issues/31857. 
   
   I will add this issue as an extra example there, and then we can close this.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
