jonkeane commented on code in PR #12738:
URL: https://github.com/apache/arrow/pull/12738#discussion_r854329903


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -372,3 +372,50 @@ binding_format_datetime <- function(x, format = "", tz = 
"", usetz = FALSE) {
 
   build_expr("strftime", x, options = list(format = format, locale = 
Sys.getlocale("LC_TIME")))
 }
+
+binding_as_date <- function(x,
+                            format = NULL,
+                            tryFormats = "%Y-%m-%d",
+                            origin = "1970-01-01",
+                            tz = "UTC",
+                            base = TRUE) {
+
+  if (is.null(format) && length(tryFormats) > 1) {
+    abort("`as.Date()` with multiple `tryFormats` is not supported in Arrow")
+  }
+
+  if (call_binding("is.Date", x)) {
+    return(x)
+
+    # cast from POSIXct
+  } else if (call_binding("is.POSIXct", x)) {
+    # base::as.Date() first converts to the desired timezone and then extracts
+    # the date, which is why we need to go through timestamp() first
+    if (base || !is.null(tz)) {
+      x <- build_expr("cast", x, options = cast_options(to_type = 
timestamp(timezone = tz)))
+    }
+    # POSIXct is of type double -> we need this to prevent going down the
+    # "double" branch
+    x <- x
+
+    # cast from character
+  } else if (call_binding("is.character", x)) {
+    format <- format %||% tryFormats[[1]]
+    # unit = 0L is the identifier for seconds in valid_time32_units
+    x <- build_expr("strptime", x, options = list(format = format, unit = 0L))
+
+    # cast from numeric
+  } else if (call_binding("is.numeric", x) &
+             (!call_binding("is.integer", x) | origin != "1970-01-01")) {
+    # Arrow does not support direct casting from double to date32(), but for
+    # integer-like values we can go via int32()
+    # https://issues.apache.org/jira/browse/ARROW-15798
+    # TODO revisit if arrow decides to support double -> date casting
+    x <- build_expr("cast", x, options = cast_options(to_type = int32()))
+    delta_in_sec <- call_binding("difftime", origin, "1970-01-01")
+    delta_in_sec <- build_expr("cast", delta_in_sec, options = 
cast_options(to_type = int64()))
+    delta_in_days <- (delta_in_sec / 86400L)$cast(int32())
+    x <- build_expr("+", x, delta_in_days)

Review Comment:
   > I think my thinking (for conflating numeric with origin) was that the 
origin argument only makes sense when x is numeric. If we had separate methods, 
then origin would only be an arg for the numeric method.
   
   Right, that totally makes sense that origin is only relevant for when `x` is 
numeric. My comment was mostly about how hard it is to reason through that 
if/else clause to figure out that was what was doing on. I do really think this 
chunk of code would be easier to reason about if we broke it up and did each 
type conversion on its own. We don't really need or want to do full-on method 
dispatch, but if we broke up each piece as a separate function it'll make 
what's going on in each + why much easier to follow.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to