dragosmg commented on code in PR #12589:
URL: https://github.com/apache/arrow/pull/12589#discussion_r861921709


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1471,3 +1471,128 @@ test_that("make_difftime()", {
     )
   )
 })
+
+test_that("parse_date_time()", {
+  test_df <- tibble(
+    dates = c("09-01-17", "02-Sep-17")
+  )
+
+  test_dates <- tibble::tibble(
+    string_ymd = c(
+      "2021-09-1", "2021/09///2", "2021.09.03", "2021,09,4", "2021:09::5",
+      "2021 09   6", "21-09-07", "21/09/08", "21.09.9", "21,09,10", "21:09:11",
+      # not yet working for strings with no separators, like "20210917", 
"210918" or "2021Sep19
+      # no separators and %b or %B are even more complicated (and they work in
+      # lubridate). not to mention locale
+      NA
+    ),
+    string_dmy = c(
+      "1-09-2021", "2/09//2021", "03.09.2021", "04,09,2021", "5:::09:2021",
+      "6  09  2021", "07-09-21", "08/09/21", "9.09.21", "10,09,21", "11:09:21",
+      # not yet working for strings with no separators, like "10092021", 
"100921",
+      NA
+    ),
+    string_mdy = c(
+      "09-01-2021", "09/2/2021", "09.3.2021", "09,04,2021", "09:05:2021",
+      "09 6 2021", "09-7-21", "09/08/21", "09.9.21", "09,10,21", "09:11:21",
+      # not yet working for strings with no separators, like "09102021", 
"091021",
+      NA
+    )
+  )
+
+  test_dates_locale <- tibble::tibble(
+    string_ymd = c(
+      "2021 Sep 12", "2021 September 13", "21 Sep 14", "21 September 15", NA
+    ),
+    string_dmy = c(
+      "12 Sep 2021", "13 September 2021", "14 Sep 21", "15 September 21", NA
+    ),
+    string_mdy = c(
+      "Sep 12 2021", "September 13 2021", "Sep 14 21", "September 15 21", NA
+    )
+  )
+
+  skip_if_not_available("re2")
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        parsed_date_ymd = parse_date_time(string_ymd, orders = "ymd"),
+        parsed_date_dmy = parse_date_time(string_dmy, orders = "dmy"),
+        parsed_date_mdy = parse_date_time(string_mdy, orders = "mdy")
+      ) %>%
+      collect(),
+    test_dates
+  )
+
+  # locale not working properly on windows
+  # TODO revisit once https://issues.apache.org/jira/browse/ARROW-16399 is done
+  skip_on_os("windows")

Review Comment:
   > (...), so there actually are test cases that we should be asserting are ok 
in there.
   
   Not sure what you mean by this. We are testing the months as numbers 
separately and those tests are run for all platforms (except those that do not 
have `RE2`). The reason for splitting was exactly to avoid the situation where 
we skip too many tests (and skipping just those that are problematic, i.e. 
`"%b"` and `"%B"` formats for months)
   



##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1471,3 +1471,128 @@ test_that("make_difftime()", {
     )
   )
 })
+
+test_that("parse_date_time()", {
+  test_df <- tibble(
+    dates = c("09-01-17", "02-Sep-17")
+  )
+
+  test_dates <- tibble::tibble(
+    string_ymd = c(
+      "2021-09-1", "2021/09///2", "2021.09.03", "2021,09,4", "2021:09::5",
+      "2021 09   6", "21-09-07", "21/09/08", "21.09.9", "21,09,10", "21:09:11",
+      # not yet working for strings with no separators, like "20210917", 
"210918" or "2021Sep19
+      # no separators and %b or %B are even more complicated (and they work in
+      # lubridate). not to mention locale
+      NA
+    ),
+    string_dmy = c(
+      "1-09-2021", "2/09//2021", "03.09.2021", "04,09,2021", "5:::09:2021",
+      "6  09  2021", "07-09-21", "08/09/21", "9.09.21", "10,09,21", "11:09:21",
+      # not yet working for strings with no separators, like "10092021", 
"100921",
+      NA
+    ),
+    string_mdy = c(
+      "09-01-2021", "09/2/2021", "09.3.2021", "09,04,2021", "09:05:2021",
+      "09 6 2021", "09-7-21", "09/08/21", "09.9.21", "09,10,21", "09:11:21",
+      # not yet working for strings with no separators, like "09102021", 
"091021",
+      NA
+    )
+  )
+
+  test_dates_locale <- tibble::tibble(
+    string_ymd = c(
+      "2021 Sep 12", "2021 September 13", "21 Sep 14", "21 September 15", NA
+    ),
+    string_dmy = c(
+      "12 Sep 2021", "13 September 2021", "14 Sep 21", "15 September 21", NA
+    ),
+    string_mdy = c(
+      "Sep 12 2021", "September 13 2021", "Sep 14 21", "September 15 21", NA
+    )
+  )
+
+  skip_if_not_available("re2")
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        parsed_date_ymd = parse_date_time(string_ymd, orders = "ymd"),
+        parsed_date_dmy = parse_date_time(string_dmy, orders = "dmy"),
+        parsed_date_mdy = parse_date_time(string_mdy, orders = "mdy")
+      ) %>%
+      collect(),
+    test_dates
+  )
+
+  # locale not working properly on windows
+  # TODO revisit once https://issues.apache.org/jira/browse/ARROW-16399 is done
+  skip_on_os("windows")

Review Comment:
   > (...), so there actually are test cases that we should be asserting are ok 
in there.
   
   Not sure what you mean by this. We **are** testing the months as numbers 
separately and those tests are run for all platforms (except those that do not 
have `RE2`). The reason for splitting was exactly to avoid the situation where 
we skip too many tests (and skipping just those that are problematic, i.e. 
`"%b"` and `"%B"` formats for months)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to