ianmcook commented on a change in pull request #11176:
URL: https://github.com/apache/arrow/pull/11176#discussion_r711590655



##########
File path: r/tests/testthat/test-dplyr.R
##########
@@ -989,19 +989,86 @@ test_that("sign()", {
   )
 })
 
-test_that("ceiling(), floor(), trunc()", {
+test_that("ceiling(), floor(), trunc(), round()", {
   df <- tibble(x = c(-1, -0.55, -0.5, -0.1, 0, 0.1, 0.5, 0.55, 1, NA, NaN))
 
   expect_dplyr_equal(
     input %>%
       mutate(
         c = ceiling(x),
         f = floor(x),
-        t = trunc(x)
+        t = trunc(x),
+        r = round(x)
       ) %>%
       collect(),
     df
   )
+
+  # with digits set to 1
+  expect_dplyr_equal(
+    input %>%
+      filter(x %% 0.5 == 0) %>%
+      mutate(r = round(x, 1)) %>%
+      collect(),
+    df
+  )
+
+  # with digits set to -1
+  expect_dplyr_equal(
+    input %>%
+      mutate(
+        rd = round(floor(x * 111), -1), # double
+        y = ifelse(is.nan(x), NA_integer_, x),
+        ri = round(as.integer(y * 111), -1) # integer (with the NaN removed)
+      ) %>%
+      collect(),
+    df
+  )
+
+  # round(x, -2) is equivalent to round_to_multiple(x, 100)
+  expect_equal(
+    Table$create(x = 1111.1) %>%
+      mutate(r = round(x, -2)) %>%
+      collect(),
+    Table$create(x = 1111.1) %>%
+      mutate(r = arrow_round_to_multiple(x, options = list(multiple = 100))) 
%>%
+      collect()
+  )
+
+  skip_on_os("windows") # float representation error might cause inconsistency
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(r = round(x, 1)) %>%
+      collect(),
+    df
+  )
+
+  # verify that round mode HALF_TO_EVEN which is what the round() binding uses
+  # yields results consistent with R...
+  expect_equal(
+    as.vector(
+      call_function(
+        "round",
+        Array$create(df$x),
+        options = list(ndigits = 1L, round_mode = RoundMode$HALF_TO_EVEN)
+      )
+    ),
+    round(df$x, 1)
+  )
+  # ... but round mode HALF_TOWARDS_ZERO does not.

Review comment:
       The R docs say:
   >Note that for rounding off a 5, the IEC 60559 standard (see also ‘IEEE 
754’) is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0 
and round(-1.5) is -2.
   
   So R's rounding mode is equivalent to the `HALF_TO_EVEN` mode in Arrow. The 
expectation beginning on line 1049 tests that. The expectation beginning on 
line 1060 effectively tests the test.
   
   In other words: I want to verify here that if we use a different round mode 
(such as `HALF_TOWARDS_ZERO`) it should return results that are _not_ equal to 
R's rounding behavior when applied to the test data. If someone in the future 
messes around with the test data, or increases the default tolerance of 
`all.equal()`, then the test beginning on line 1049 might still succeed but 
effectively not test what we want it to test. So the test on line 1060 is 
designed to fail if that happens.
   
   Currently the default tolerance of `all.equal()` is sufficient to detect 
that but not detect floating-point error propagation.
   
   The R docs go on to say:
   >However, this is dependent on OS services and on representation error 
(since e.g. 0.15 is not represented exactly, the rounding rule applies to the 
represented number and not to the printed number, and so round(0.15, 1) could 
be either 0.1 or 0.2).
   
   This does not seem to be causing any problems in our CI systems, but I set 
these tests to be skipped on CRAN to avoid possible failures on the CRAN test 
machines that could hypothetically be caused by the dependency of R's rounding 
behavior on OS services.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to