ianmcook commented on a change in pull request #11176:
URL: https://github.com/apache/arrow/pull/11176#discussion_r711590655



##########
File path: r/tests/testthat/test-dplyr.R
##########
@@ -989,19 +989,86 @@ test_that("sign()", {
   )
 })
 
-test_that("ceiling(), floor(), trunc()", {
+test_that("ceiling(), floor(), trunc(), round()", {
   df <- tibble(x = c(-1, -0.55, -0.5, -0.1, 0, 0.1, 0.5, 0.55, 1, NA, NaN))
 
   expect_dplyr_equal(
     input %>%
       mutate(
         c = ceiling(x),
         f = floor(x),
-        t = trunc(x)
+        t = trunc(x),
+        r = round(x)
       ) %>%
       collect(),
     df
   )
+
+  # with digits set to 1
+  expect_dplyr_equal(
+    input %>%
+      filter(x %% 0.5 == 0) %>%
+      mutate(r = round(x, 1)) %>%
+      collect(),
+    df
+  )
+
+  # with digits set to -1
+  expect_dplyr_equal(
+    input %>%
+      mutate(
+        rd = round(floor(x * 111), -1), # double
+        y = ifelse(is.nan(x), NA_integer_, x),
+        ri = round(as.integer(y * 111), -1) # integer (with the NaN removed)
+      ) %>%
+      collect(),
+    df
+  )
+
+  # round(x, -2) is equivalent to round_to_multiple(x, 100)
+  expect_equal(
+    Table$create(x = 1111.1) %>%
+      mutate(r = round(x, -2)) %>%
+      collect(),
+    Table$create(x = 1111.1) %>%
+      mutate(r = arrow_round_to_multiple(x, options = list(multiple = 100))) 
%>%
+      collect()
+  )
+
+  skip_on_os("windows") # float representation error might cause inconsistency
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(r = round(x, 1)) %>%
+      collect(),
+    df
+  )
+
+  # verify that round mode HALF_TO_EVEN which is what the round() binding uses
+  # yields results consistent with R...
+  expect_equal(
+    as.vector(
+      call_function(
+        "round",
+        Array$create(df$x),
+        options = list(ndigits = 1L, round_mode = RoundMode$HALF_TO_EVEN)
+      )
+    ),
+    round(df$x, 1)
+  )
+  # ... but round mode HALF_TOWARDS_ZERO does not.

Review comment:
       The R docs say:
   >Note that for rounding off a 5, the IEC 60559 standard (see also ‘IEEE 
754’) is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0 
and round(-1.5) is -2.
   
   So R's rounding mode is equivalent to the `HALF_TO_EVEN` mode in Arrow. The 
expectation beginning on line 1049 tests that. The expectation beginning on 
line 1060 effectively tests the test.
   
   In other words: I want to verify here that if we use a different round mode 
(such as `HALF_TOWARDS_ZERO`) it should return results that are _not_ equal to 
R's rounding behavior when applied to the test data. If someone in the future 
messes around with the test data, or increases the default tolerance of 
`all.equal()`, then the test beginning on line 1049 might still succeed but 
effectively not test what we want it to test. So the test on line 1060 is 
designed to fail if that happens.
   
   Currently the default tolerance of `all.equal()` is sufficient to detect 
that but not detect floating-point error propagation.
   
   The R docs go on to say:
   >However, this is dependent on OS services and on representation error 
(since e.g. 0.15 is not represented exactly, the rounding rule applies to the 
represented number and not to the printed number, and so round(0.15, 1) could 
be either 0.1 or 0.2).
   
   As usual "depending on OS services" means Windows does it the unexpected 
way, so this is skipped on Windows. It's also skipped on CRAN to avoid possible 
failures on the CRAN test machines.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to