ianmcook commented on a change in pull request #11176:
URL: https://github.com/apache/arrow/pull/11176#discussion_r711590655
##########
File path: r/tests/testthat/test-dplyr.R
##########
@@ -989,19 +989,86 @@ test_that("sign()", {
)
})
-test_that("ceiling(), floor(), trunc()", {
+test_that("ceiling(), floor(), trunc(), round()", {
df <- tibble(x = c(-1, -0.55, -0.5, -0.1, 0, 0.1, 0.5, 0.55, 1, NA, NaN))
expect_dplyr_equal(
input %>%
mutate(
c = ceiling(x),
f = floor(x),
- t = trunc(x)
+ t = trunc(x),
+ r = round(x)
) %>%
collect(),
df
)
+
+ # with digits set to 1
+ expect_dplyr_equal(
+ input %>%
+ filter(x %% 0.5 == 0) %>%
+ mutate(r = round(x, 1)) %>%
+ collect(),
+ df
+ )
+
+ # with digits set to -1
+ expect_dplyr_equal(
+ input %>%
+ mutate(
+ rd = round(floor(x * 111), -1), # double
+ y = ifelse(is.nan(x), NA_integer_, x),
+ ri = round(as.integer(y * 111), -1) # integer (with the NaN removed)
+ ) %>%
+ collect(),
+ df
+ )
+
+ # round(x, -2) is equivalent to round_to_multiple(x, 100)
+ expect_equal(
+ Table$create(x = 1111.1) %>%
+ mutate(r = round(x, -2)) %>%
+ collect(),
+ Table$create(x = 1111.1) %>%
+ mutate(r = arrow_round_to_multiple(x, options = list(multiple = 100)))
%>%
+ collect()
+ )
+
+ skip_on_os("windows") # float representation error might cause inconsistency
+
+ expect_dplyr_equal(
+ input %>%
+ mutate(r = round(x, 1)) %>%
+ collect(),
+ df
+ )
+
+ # verify that round mode HALF_TO_EVEN which is what the round() binding uses
+ # yields results consistent with R...
+ expect_equal(
+ as.vector(
+ call_function(
+ "round",
+ Array$create(df$x),
+ options = list(ndigits = 1L, round_mode = RoundMode$HALF_TO_EVEN)
+ )
+ ),
+ round(df$x, 1)
+ )
+ # ... but round mode HALF_TOWARDS_ZERO does not.
Review comment:
The R docs say:
>Note that for rounding off a 5, the IEC 60559 standard (see also ‘IEEE
754’) is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0
and round(-1.5) is -2. However, this is dependent on OS services and on
representation error (since e.g. 0.15 is not represented exactly, the rounding
rule applies to the represented number and not to the printed number, and so
round(0.15, 1) could be either 0.1 or 0.2).
So R's rounding mode is equivalent to the `HALF_TO_EVEN` mode in Arrow. The
expectation beginning on line 1049 tests that. The expectation beginning on
line 1060 effectively tests the test.
In other words: I want to verify here that if we use a different round mode
(such as `HALF_TOWARDS_ZERO`) it should return results that are _not_ equal to
R's rounding behavior when applied to the test data. If someone in the future
messes around with the test data, or increases the default tolerance of
`all.equal()`, then the test beginning on line 1049 might still succeed but
effectively not test what we want it to test. So the test on line 1060 is
designed to fail if that happens.
Currently the default tolerance of `all.equal()` is sufficient to detect
that but not detect floating-point error propagation.
These tests are skipped on CRAN to avoid possible failures on CRAN test
machines that could hypothetically be caused by the dependency of R's rounding
behavior on OS services.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]