jonkeane commented on code in PR #12589:
URL: https://github.com/apache/arrow/pull/12589#discussion_r861914674


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1471,3 +1471,128 @@ test_that("make_difftime()", {
     )
   )
 })
+
+test_that("parse_date_time()", {
+  test_df <- tibble(
+    dates = c("09-01-17", "02-Sep-17")
+  )
+
+  test_dates <- tibble::tibble(
+    string_ymd = c(
+      "2021-09-1", "2021/09///2", "2021.09.03", "2021,09,4", "2021:09::5",
+      "2021 09   6", "21-09-07", "21/09/08", "21.09.9", "21,09,10", "21:09:11",
+      # not yet working for strings with no separators, like "20210917", 
"210918" or "2021Sep19
+      # no separators and %b or %B are even more complicated (and they work in
+      # lubridate). not to mention locale
+      NA
+    ),
+    string_dmy = c(
+      "1-09-2021", "2/09//2021", "03.09.2021", "04,09,2021", "5:::09:2021",
+      "6  09  2021", "07-09-21", "08/09/21", "9.09.21", "10,09,21", "11:09:21",
+      # not yet working for strings with no separators, like "10092021", 
"100921",
+      NA
+    ),
+    string_mdy = c(
+      "09-01-2021", "09/2/2021", "09.3.2021", "09,04,2021", "09:05:2021",
+      "09 6 2021", "09-7-21", "09/08/21", "09.9.21", "09,10,21", "09:11:21",
+      # not yet working for strings with no separators, like "09102021", 
"091021",
+      NA
+    )
+  )
+
+  test_dates_locale <- tibble::tibble(
+    string_ymd = c(
+      "2021 Sep 12", "2021 September 13", "21 Sep 14", "21 September 15", NA
+    ),
+    string_dmy = c(
+      "12 Sep 2021", "13 September 2021", "14 Sep 21", "15 September 21", NA
+    ),
+    string_mdy = c(
+      "Sep 12 2021", "September 13 2021", "Sep 14 21", "September 15 21", NA
+    )
+  )
+
+  skip_if_not_available("re2")
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        parsed_date_ymd = parse_date_time(string_ymd, orders = "ymd"),
+        parsed_date_dmy = parse_date_time(string_dmy, orders = "dmy"),
+        parsed_date_mdy = parse_date_time(string_mdy, orders = "mdy")
+      ) %>%
+      collect(),
+    test_dates
+  )
+
+  # locale not working properly on windows
+  # TODO revisit once https://issues.apache.org/jira/browse/ARROW-16399 is done
+  skip_on_os("windows")

Review Comment:
   > I assumed we set the locale to "C" globally, don't we?
   
   Yeah, you're right globally for this file at least, so that should be good
   
   > Those tests will simply return NA on Windows, resulting in a difference to 
expected.
   
   That seems not ideal. Even if we accept that dates with months written out 
in them will not work on windows, Looking at those failures: they are happening 
only with text months, so there actually are test cases that we should be 
asserting are ok in there.
   
   But I'm still a bit surprised that we're seeing this difference at all since 
we set the locale at the top. If I'm reading the issue correctly, that's about 
locales other than C not working (on windows). Is there a different issue about 
even the C locale + strptime not working with `%b` and the like? If I'm reading 
the strftime tests we do confirm that the opposite direction works...
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to