Jonathan Keane created ARROW-14200:
--------------------------------------

             Summary: [R] [C++] strftime on a date should not use or be 
confused by timezones
                 Key: ARROW-14200
                 URL: https://issues.apache.org/jira/browse/ARROW-14200
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++, R
            Reporter: Jonathan Keane


When the input to {{strftime}} is a date, timezones shouldn't be necessary or 
assumed.

What I think is going on below is that the date 1992-01-01 is being interpreted 
as 1992-01-01 00:00:00 UTC, and then when {{strftime()}} is called it displays 
that timestamp in my local timezone as 1991-12-31 ... (since my system is set 
to a timezone behind UTC), and takes the year out of that. If I specify 
{{tz = "UTC"}} in the {{strftime()}} call, I get the expected result (though 
that shouldn't be necessary).
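
The suspected mechanism can be illustrated without Arrow. This is only a base-R sketch, assuming the date is promoted to a midnight-UTC timestamp before being formatted in the local timezone; it does not show what the Arrow kernel actually does, and the variable names are illustrative.

```r
# A Date has no time of day, so formatting it directly needs no timezone.
d <- as.Date("1992-01-01")
date_year <- format(d, "%Y")
date_year
#> [1] "1992"

# Assumed mechanism: the date is first promoted to midnight UTC ...
ts <- as.POSIXct("1992-01-01 00:00:00", tz = "UTC")

# ... and then rendered in a timezone behind UTC (US Central here),
# which moves it to the previous evening and therefore the previous year.
local_stamp <- format(ts, "%Y-%m-%d %H:%M:%S", tz = "America/Chicago")
local_stamp
#> [1] "1991-12-31 18:00:00"
local_year <- format(ts, "%Y", tz = "America/Chicago")
local_year
#> [1] "1991"

# Rendering the same instant in UTC gives the expected year.
utc_year <- format(ts, "%Y", tz = "UTC")
utc_year
#> [1] "1992"
```

If this is what is happening, it would explain why only the calls without an explicit {{tz}} return 1991.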


Reproducible example, run in the US Central timezone:
{code}
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
Table$create(
  data.frame(
    x = as.Date("1992-01-01")
  )
) %>% 
  mutate(
    as_int_strftime = as.integer(strftime(x, "%Y")),
    strftime = strftime(x, "%Y"),
    as_int_strftime_utc = as.integer(strftime(x, "%Y", tz = "UTC")),
    strftime_utc = strftime(x, "%Y", tz = "UTC"),
    year = year(x)
  ) %>%
  collect()
#>            x as_int_strftime strftime as_int_strftime_utc strftime_utc year
#> 1 1992-01-01            1991     1991                1992         1992 1992
{code}



