jonkeane commented on code in PR #12589:
URL: https://github.com/apache/arrow/pull/12589#discussion_r863731609
##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1471,3 +1471,128 @@ test_that("make_difftime()", {
)
)
})
+
+test_that("parse_date_time()", {
+ test_df <- tibble(
+ dates = c("09-01-17", "02-Sep-17")
+ )
+
+ test_dates <- tibble::tibble(
+ string_ymd = c(
+ "2021-09-1", "2021/09///2", "2021.09.03", "2021,09,4", "2021:09::5",
+ "2021 09 6", "21-09-07", "21/09/08", "21.09.9", "21,09,10", "21:09:11",
+ # not yet working for strings with no separators, like "20210917",
"210918" or "2021Sep19
+ # no separators and %b or %B are even more complicated (and they work in
+ # lubridate). not to mention locale
+ NA
+ ),
+ string_dmy = c(
+ "1-09-2021", "2/09//2021", "03.09.2021", "04,09,2021", "5:::09:2021",
+ "6 09 2021", "07-09-21", "08/09/21", "9.09.21", "10,09,21", "11:09:21",
+ # not yet working for strings with no separators, like "10092021",
"100921",
+ NA
+ ),
+ string_mdy = c(
+ "09-01-2021", "09/2/2021", "09.3.2021", "09,04,2021", "09:05:2021",
+ "09 6 2021", "09-7-21", "09/08/21", "09.9.21", "09,10,21", "09:11:21",
+ # not yet working for strings with no separators, like "09102021",
"091021",
+ NA
+ )
+ )
+
+ test_dates_locale <- tibble::tibble(
+ string_ymd = c(
+ "2021 Sep 12", "2021 September 13", "21 Sep 14", "21 September 15", NA
+ ),
+ string_dmy = c(
+ "12 Sep 2021", "13 September 2021", "14 Sep 21", "15 September 21", NA
+ ),
+ string_mdy = c(
+ "Sep 12 2021", "September 13 2021", "Sep 14 21", "September 15 21", NA
+ )
+ )
+
+ skip_if_not_available("re2")
+ compare_dplyr_binding(
+ .input %>%
+ mutate(
+ parsed_date_ymd = parse_date_time(string_ymd, orders = "ymd"),
+ parsed_date_dmy = parse_date_time(string_dmy, orders = "dmy"),
+ parsed_date_mdy = parse_date_time(string_mdy, orders = "mdy")
+ ) %>%
+ collect(),
+ test_dates
+ )
+
+ # locale not working properly on windows
+ # TODO revisit once https://issues.apache.org/jira/browse/ARROW-16399 is done
+ skip_on_os("windows")
Review Comment:
Yes, sorry I was looking at the original failures that had values for both
`test_dates` and `test_dates_locale` mixed together, but looking at this again
you do test just `test_dates_locale` here which is good. Also 💯 on getting a
Jira for this issue + linking it.
And honestly — to make this even more obvious — since you only use
`test_dates` and `test_dates_locale` you might consider using those definitions
directly as the `tbl` argument for `compare_dplyr_binding`. That way one
doesn't have to scroll up to see what's in either of them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]