jonkeane commented on a change in pull request #12433:
URL: https://github.com/apache/arrow/pull/12433#discussion_r817900666
##########
File path: r/R/dplyr-funcs-type.R
##########
@@ -76,6 +76,60 @@ register_bindings_type_cast <- function() {
   register_binding("as.numeric", function(x) {
     Expression$create("cast", x, options = cast_options(to_type = float64()))
   })
+  register_binding("as.Date", function(x,
+                                       format = NULL,
+                                       tryFormats = "%Y-%m-%d",
+                                       origin = "1970-01-01",
+                                       tz = "UTC") {
+
+    if (call_binding("is.Date", x)) {
+      # base::as.Date() first converts to the desired timezone and then extracts
+      # the date, which is why we need to go through timestamp() first
+      return(x)
+
+    # cast from POSIXct
+    } else if (call_binding("is.POSIXct", x)) {
+      if (tz == "UTC") {
+        x <- build_expr("cast", x, options = cast_options(to_type = timestamp(timezone = tz)))
+      } else {
+        abort("`as.Date()` with a timezone different to 'UTC' is not supported in Arrow")
+      }

Review comment:
My point wasn't about where the arg check happens. My point was that I don't think you need the arg check at all. I believe we already have all the machinery in Arrow to support timezones other than UTC, both in the input and as the `tz` argument here.

This is a more explicit version of above, but the same general principle:

``` r
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)

df <- data.frame(time = as.POSIXct("2020-01-01 23:30:00", tz = "America/Chicago"))

df %>%
  mutate(
    as_date = as.Date(time),
    as_date_nyc = as.Date(time, tz = "America/New_York"),
    as_date_chi = as.Date(time, tz = "America/Chicago"),
    as_date_lax = as.Date(time, tz = "America/Los_Angeles")
  ) %>%
  collect()
#>                  time    as_date as_date_nyc as_date_chi as_date_lax
#> 1 2020-01-01 23:30:00 2020-01-02  2020-01-02  2020-01-01  2020-01-01

df %>%
  arrow_table() %>%
  mutate(
    as_date = cast(cast(time, timestamp(timezone = "UTC")), date32()),
    as_date_nyc = cast(cast(time, timestamp(timezone = "America/New_York")), date32()),
    as_date_chi = cast(cast(time, timestamp(timezone = "America/Chicago")), date32()),
    as_date_lax = cast(cast(time, timestamp(timezone = "America/Los_Angeles")), date32())
  ) %>%
  collect()
#>                  time    as_date as_date_nyc as_date_chi as_date_lax
#> 1 2020-01-01 23:30:00 2020-01-02  2020-01-02  2020-01-01  2020-01-01
```

Am I missing something here? Or misunderstanding the issue?
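
For reference, the behaviour both pipelines rely on can be checked in base R alone, without arrow or dplyr. This is only a minimal sketch: `date_in_tz()` is a hypothetical helper written for illustration (it is not part of arrow or of this PR), and it assumes the system has the IANA timezone database available. It re-expresses one instant as wall-clock time in the requested timezone and keeps only the calendar date, which is the same "convert to the timezone, then extract the date" idea as the nested casts above.

``` r
# One instant in time: 23:30 on 2020-01-01 in Chicago,
# which is 05:30 on 2020-01-02 in UTC.
time <- as.POSIXct("2020-01-01 23:30:00", tz = "America/Chicago")

# Hypothetical helper (illustration only, not an arrow function):
# format the instant as a calendar date in `tz`, then parse it as a Date.
date_in_tz <- function(x, tz = "UTC") {
  as.Date(format(x, format = "%Y-%m-%d", tz = tz))
}

zones <- c("UTC", "America/New_York", "America/Chicago", "America/Los_Angeles")

# Named character vector: one extracted date per timezone.
dates <- vapply(zones, function(z) format(date_in_tz(time, z)), character(1))
print(dates)
```

The dates should line up with the `as_date*` columns in the table above, and with what `as.Date(time, tz = ...)` returns for each zone.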