paleolimbot commented on issue #40109:
URL: https://github.com/apache/arrow/issues/40109#issuecomment-1972314348

   It looks like the actual truncation happens here:
   
   
https://github.com/apache/arrow/blob/214378b522a36fbf6010e3d4f5470abaca7bf92e/r/src/r_to_arrow.cpp#L926
   
   The cast to the `c_type`, as David noted, is a cast to an int64. On this 
line, one could check whether the cast truncates the value (I think you can use 
`std::modf()` for that). You would probably have to count the number of lossy 
casts (e.g., `this->n_lossy_casts_++`) and issue the warning at the very end of 
the conversion.
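
   As a rough illustration of that idea, the sketch below counts lossy casts 
with `std::modf()` and exposes the total so a caller can warn once at the end. 
The class `DurationConverterSketch`, its `multiplier_` member, and the method 
names are hypothetical and are not taken from `r_to_arrow.cpp`; it also assumes 
that missing and non-finite values are handled before the cast.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the difftime converter; the names here are
// illustrative and are not taken from r_to_arrow.cpp.
class DurationConverterSketch {
 public:
  // `multiplier` scales the input into the target unit (assumed > 0).
  explicit DurationConverterSketch(double multiplier) : multiplier_(multiplier) {
    assert(multiplier > 0);
  }

  // Converts one finite, non-missing value; NA handling is assumed to
  // happen before this point.
  int64_t ConvertOne(double value) {
    double integral = 0;
    // std::modf() returns the fractional part and stores the integral part.
    // A non-zero fractional part means the int64 cast drops information.
    if (std::modf(value * multiplier_, &integral) != 0.0) {
      ++n_lossy_casts_;
    }
    return static_cast<int64_t>(integral);
  }

  std::vector<int64_t> ConvertAll(const std::vector<double>& values) {
    std::vector<int64_t> out;
    out.reserve(values.size());
    for (double value : values) out.push_back(ConvertOne(value));
    return out;
  }

  // Checked once, after the whole conversion, to decide whether to warn.
  int64_t n_lossy_casts() const { return n_lossy_casts_; }

 private:
  double multiplier_;
  int64_t n_lossy_casts_ = 0;
};
```

   Counting instead of warning per element keeps the hot loop cheap and yields 
a single message such as "N values were truncated" after the conversion.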
   
   Perhaps the underlying cause is that we infer the unit of "seconds" by 
default. We could infer "milliseconds" or "microseconds", which would avoid 
truncation (or would limit it to thousandths or millionths of a second). I 
don't know why "seconds" is the default, but a good fix for this might be to 
change it to "ms" or "us" (or add an option via `options()` to do so, perhaps 
migrating to a safer default over several versions with some warnings).
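
   To make the effect of the unit concrete, here is a small sketch of the 
arithmetic. `TruncateToUnit()` is a hypothetical helper, not part of Arrow, and 
it assumes a finite input that fits in an int64 after scaling.

```cpp
#include <cstdint>

// Hypothetical helper: truncate a duration in seconds to whole ticks of a
// unit with `ticks_per_second` ticks per second
// (1 = "s", 1000 = "ms", 1000000 = "us").
// Assumes a finite value that fits in int64 after scaling.
int64_t TruncateToUnit(double seconds, int64_t ticks_per_second) {
  return static_cast<int64_t>(seconds * static_cast<double>(ticks_per_second));
}
```

   With an input of 1.5 seconds, the "s" unit stores 1 and loses the half 
second, while "ms" stores 1500 and "us" stores 1500000 with nothing lost. 
Anything finer than the chosen unit is still dropped, just at a much smaller 
scale.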
   
   ``` r
   arrow::infer_type(as.difftime(double(), units = "secs"))
   #> DurationType
   #> duration[s]
   ```
   
   A workaround could be to specify the type explicitly:
   
   ``` r
   delta <- as.difftime(c(0.000, 0.001, 0.002, 1, 1.5), units = "secs")
   delta |> 
     arrow::as_arrow_array(type = arrow::duration("ms"))
   #> Array
   #> <duration[ms]>
   #> [
   #>   0,
   #>   1,
   #>   2,
   #>   1000,
   #>   1500
   #> ]
   ```
   
   It looks like I inferred "microseconds" by default in nanoarrow, although I 
forget the reasoning:
   
   ``` r
   library(nanoarrow)
   
   delta <- as.difftime(c(0.000, 0.001, 0.002, 1, 1.5), units = "secs")
   delta |> 
     as_nanoarrow_array() |> 
     arrow::as_arrow_array()
   #> Array
   #> <duration[us]>
   #> [
   #>   0,
   #>   1000,
   #>   2000,
   #>   1000000,
   #>   1500000
   #> ]
   ```

