dragosmg commented on a change in pull request #12433:
URL: https://github.com/apache/arrow/pull/12433#discussion_r817549370



##########
File path: r/R/dplyr-funcs-type.R
##########
@@ -76,6 +76,60 @@ register_bindings_type_cast <- function() {
   register_binding("as.numeric", function(x) {
     Expression$create("cast", x, options = cast_options(to_type = float64()))
   })
+  register_binding("as.Date", function(x,
+                                       format = NULL,
+                                       tryFormats = "%Y-%m-%d",
+                                       origin = "1970-01-01",
+                                       tz = "UTC") {
+
+    if (call_binding("is.Date", x)) {
+      # base::as.Date() first converts to the desired timezone and then extracts
+      # the date, which is why we need to go through timestamp() first
+      return(x)
+
+    # cast from POSIXct
+    } else if (call_binding("is.POSIXct", x)) {
+      if (tz == "UTC") {
+        x <- build_expr("cast", x, options = cast_options(to_type = timestamp(timezone = tz)))
+      } else {
+        abort("`as.Date()` with a timezone different to 'UTC' is not supported in Arrow")
+      }
+
+    # cast from character
+    } else if (call_binding("is.character", x)) {
+      # this could be improved with tryFormats once strptime returns NA and we
+      # can use coalesce - https://issues.apache.org/jira/browse/ARROW-15659
+      # TODO revisit once https://issues.apache.org/jira/browse/ARROW-15659 is done
+      if (is.null(format)) {
+        if (length(tryFormats) == 1) {
+          format <- tryFormats[1]
+        } else {
+          abort("`as.Date()` with multiple `tryFormats` is not supported in Arrow yet")
+        }
+      }
+      x <- build_expr("strptime", x, options = list(format = format, unit = 0L))
+
+    # cast from numeric
+    } else if (call_binding("is.numeric", x)) {
+      # the origin argument will be better supported once we implement temporal
+      # arithmetic (https://issues.apache.org/jira/browse/ARROW-14947)
+      # TODO revisit once the above has been sorted
+      if (!call_binding("is.integer", x)) {
+        # Arrow does not support direct casting from double to date so we have
+        # to convert to integers first - casting to int32() would error so we
+        # need to use `floor` before casting. `floor` is also a bit safer than
+        # int32() with `safe = FALSE` since it doesn't switch to `ceiling` for
+        # negative numbers
+        # TODO revisit if arrow decides to support double -> date casting
+        x <- call_binding("floor", x)
+        x <- build_expr("cast", x, options = cast_options(to_type = int32()))
+      }
+      if (origin != "1970-01-01") {
+        abort("`as.Date()` with an `origin` different than '1970-01-01' is not supported in Arrow")
+      }
+    }

Review comment:
       While I generally agree that argument validation should happen at the
top of a function, I'm on the fence on this one. The `origin` argument is only
relevant when coercing a numeric, so I'm a bit reluctant to move the check
outside the `numeric` block.
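
To illustrate the trade-off, here is a minimal sketch in plain base R, not the actual binding. `as_date_sketch` is a hypothetical name, and it assumes ordinary vectors rather than Arrow expressions. With the `origin` check kept inside the numeric branch, a non-default `origin` only errors when it would actually affect the result:

```r
# Hypothetical stand-in for the binding: base R only, no Arrow expressions.
as_date_sketch <- function(x, origin = "1970-01-01") {
  if (inherits(x, "Date")) {
    return(x)
  }
  if (is.numeric(x)) {
    # `origin` only matters for numeric input, so it is validated here
    # rather than at the top of the function.
    if (origin != "1970-01-01") {
      stop("`origin` other than '1970-01-01' is not supported", call. = FALSE)
    }
    # floor() mirrors the PR: round fractional day counts down first.
    return(as.Date(floor(x), origin = "1970-01-01"))
  }
  if (is.character(x)) {
    # `origin` is irrelevant for character input, so it is never inspected.
    return(as.Date(x, format = "%Y-%m-%d"))
  }
  stop("unsupported input type", call. = FALSE)
}

as_date_sketch("2022-03-01", origin = "2000-01-01")     # works, origin is ignored
try(as_date_sketch(1, origin = "2000-01-01"), silent = TRUE)  # errors
```

Moving the check to the top would make the first call fail as well, even though `origin` plays no part in parsing a string.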




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.