paleolimbot commented on code in PR #13397:
URL: https://github.com/apache/arrow/pull/13397#discussion_r917933784


##########
r/R/compute.R:
##########
@@ -307,3 +307,157 @@ cast_options <- function(safe = TRUE, ...) {
   )
   modifyList(opts, list(...))
 }
+
+#' Register user-defined functions
+#'
+#' These functions support calling R code from query engine execution
+#' (i.e., a [dplyr::mutate()] or [dplyr::filter()] on a [Table] or [Dataset]).
+#' Use [arrow_scalar_function()] to define an R function that accepts and
+#' returns R objects; use [arrow_advanced_scalar_function()] to define a
+#' lower-level function that operates directly on Arrow objects.
+#'
+#' @param name The function name to be used in the dplyr bindings
+#' @param scalar_function An object created with [arrow_scalar_function()]
+#'   or [arrow_advanced_scalar_function()].
+#' @param in_type A [DataType] of the input type or a [schema()]
+#'   for functions with more than one argument. This signature will be used
+#'   to determine if this function is appropriate for a given set of arguments.
+#'   If this function is appropriate for more than one signature, pass a
+#'   `list()` of the above.
+#' @param out_type A [DataType] of the output type or a function accepting
+#'   a single argument (`types`), which is a `list()` of [DataType]s. If a
+#'   function it must return a [DataType].
+#' @param fun An R function or rlang-style lambda expression. This function
+#'   will be called with R objects as arguments and must return an object
+#'   that can be converted to an [Array] using [as_arrow_array()]. Function
+#'   authors must take care to return an array castable to the output data
+#'   type specified by `out_type`.
+#' @param advanced_fun An R function or rlang-style lambda expression. This
+#'   function will be called with exactly two arguments: `kernel_context`,
+#'   which is a `list()` of objects giving information about the
+#'   execution context and `args`, which is a list of [Array] or [Scalar]
+#'   objects corresponding to the input arguments.
+#'
+#' @return
+#'   - `register_user_defined_function()`: `NULL`, invisibly
+#'   - `arrow_scalar_function()`: returns an object of class
+#'     "arrow_advanced_scalar_function" that can be passed to
+#'     `register_user_defined_function()`.
+#' @export
+#'
+#' @examplesIf .Machine$sizeof.pointer >= 8
+#' fun_wrapper <- arrow_scalar_function(
+#'   function(x, y, z) x + y + z,
+#'   schema(x = float64(), y = float64(), z = float64()),
+#'   float64()
+#' )
+#' register_user_defined_function(fun_wrapper, "example_add3")
+#'
+#' call_function(
+#'   "example_add3",
+#'   Scalar$create(1),
+#'   Scalar$create(2),
+#'   Array$create(3)
+#' )
+#'
+#' # use arrow_advanced_scalar_function() for a lower-level interface
+#' advanced_fun_wrapper <- arrow_advanced_scalar_function(
+#'   function(context, args) {

Review Comment:
   Mostly because it is easier to construct the call from C++ with a fixed number of 
arguments. The version created with `arrow_scalar_function()` passes the inputs as normal 
arguments to the user-provided function (I expect this is what most 
people will use). A list is also slightly easier to program around than R's 
variable-argument `...` behaviour. I could put some more effort into the C++ 
call, though.
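
To illustrate the trade-off, here is a minimal sketch in base R only. It does not call any arrow functions, and `advanced_style()`, `dots_style()`, and `as_fixed_arity()` are hypothetical names made up for this example rather than anything in the PR:

```r
# Hypothetical illustration in base R; nothing here is part of the arrow API.

# Fixed-arity style: the caller always passes exactly two arguments,
# no matter how many inputs the kernel receives.
advanced_style <- function(context, args) {
  # `args` is a plain list, so it can be counted, indexed, and iterated directly
  Reduce(`+`, args)
}

# Variadic style: the inputs have to be captured from `...` first
dots_style <- function(...) {
  args <- list(...)
  Reduce(`+`, args)
}

inputs <- list(1, 2, 3)

# The fixed-arity call has the same shape for any number of inputs
fixed_result <- advanced_style(list(), inputs)

# The variadic call has to be assembled dynamically, e.g. with do.call()
dots_result <- do.call(dots_style, inputs)

# One possible way (an assumption, not the PR's implementation) to adapt a
# function with normal arguments to the fixed two-argument form
as_fixed_arity <- function(fun) {
  function(context, args) do.call(fun, args)
}
wrapped <- as_fixed_arity(function(x, y, z) x + y + z)
wrapped_result <- wrapped(list(), inputs)
```

With the fixed form, the calling side only ever builds a two-argument call. The variadic form needs a call whose length depends on the number of inputs.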



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
