bkietz commented on a change in pull request #10751:
URL: https://github.com/apache/arrow/pull/10751#discussion_r673491621
##########
File path: r/R/dplyr-functions.R
##########
@@ -57,6 +57,46 @@ nse_funcs$cast <- function(x, target_type, safe = TRUE, ...) {
Expression$create("cast", x, options = opts)
}
+nse_funcs$coalesce <- function(...) {
+ args <- list2(...)
+ if (length(args) < 1) {
+ abort("At least one argument must be supplied to coalesce()")
+ }
+
+ # Treat NaN like NA for consistency with dplyr::coalesce():
+ # if *all* the values are NaN, we should return NaN, not NA, so don't replace
+ # NaN with NA in the final (or only) argument
+ # TODO: if an option is added to the coalesce kernel to treat NaN as NA,
+ # use that to simplify the code here (ARROW-13389)
+ attr(args[[length(args)]], "last") <- TRUE
+ args <- lapply(args, function(arg) {
+ last_arg <- is.null(attr(arg, "last"))
+ attr(arg, "last") <- NULL
+
+ if (!inherits(arg, "Expression")) {
+ arg <- Expression$scalar(arg)
+ }
+
+ # coalesce doesn't yet support factors/dictionaries
+ # TODO: remove this after ARROW-13390 is merged
+ if (nse_funcs$is.factor(arg)) {
+ warning("Dictionaries (in R: factors) are currently converted to strings (characters) in coalesce", call. = FALSE)
+ }
+
+ if (last_arg && arg$type_id() %in% TYPES_WITH_NAN) {
+ # store the NA_real_ in Arrow's smallest float type to avoid casting
+ # smaller float types to larger float types
+ NA_expr <- Expression$scalar(Scalar$create(NA_real_, type = arg$type()))
+ # TODO: Figure out why this doesn't work:
+ #Expression$create("replace_with_mask", arg, Expression$create("is_nan", arg), NA_expr)
Review comment:
```suggestion
# TODO: Figure out why this doesn't work:
#Expression$create("replace_with_mask", arg,
#                  Expression$create("is_nan", arg), NA_expr)
```
`replace_with_mask` is not a scalar function, so it cannot be used in an
`Expression`. (In general, any function that accepts arguments of differing
lengths is not a scalar function.) `if_else` should be fully equivalent for
this use case.
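
For illustration, here is a minimal sketch of the `if_else` approach. The helper names `nan_to_na` and `nan_to_na_expr` are hypothetical. The first part is a base R analogue of the element-wise replacement. The second part reuses `arg` and `NA_expr` from the diff and assumes the `if_else` kernel takes its arguments as (condition, value if true, value if false) and that `Expression` is in scope, as it is inside the package. It has not been run against this PR.

```r
# Base R analogue: the condition, the replacement, and the input are combined
# element by element (the scalar NA is recycled), which is the shape a scalar
# function requires.
nan_to_na <- function(x) ifelse(is.nan(x), NA_real_, x)

x <- c(1, NaN, NA, 4)
res <- nan_to_na(x)

# Assumed Arrow form of the same idea, to stand in for the commented-out
# replace_with_mask call. Only defined here, not called; it needs the arrow
# package's Expression class to be in scope.
nan_to_na_expr <- function(arg, NA_expr) {
  Expression$create(
    "if_else",
    Expression$create("is_nan", arg),  # condition: TRUE where the value is NaN
    NA_expr,                           # value used where the condition is TRUE
    arg                                # value used otherwise
  )
}
```

In the base R version the NaN becomes NA, while the existing NA and the ordinary values pass through unchanged.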