Re: [Rd] New pipe operator

Duncan Murdoch Wed, 09 Dec 2020 08:09:18 -0800

On 09/12/2020 10:42 a.m., Jan van der Laan wrote:





On 09-12-2020 16:20, Duncan Murdoch wrote:

On 09/12/2020 9:55 a.m., Jan van der Laan wrote:


I think only allowing functions on the right hand side (e.g. only the |>
operator and not the |:>) would be enough to handle most cases and seems
easier to reason about. The limitations of that can easily be worked
around using existing functionality in the language.


I agree that would be sufficient, but I don't see how it makes reasoning
easier.  The transformation is trivial, so I'll assume that doesn't
consume any mental energy compared to understanding what the final
expression actually does.  Using your currying example, the choice is
between

   x |> mean(na.rm = TRUE)

which transforms to mean(x, na.rm = TRUE), or your proposed

   x |> curry(mean, na.rm = TRUE)

which transforms to

   curry(mean, na.rm = TRUE)(x)

To me curry(mean, na.rm = TRUE)(x) looks a lot more complicated than
mean(x, na.rm = TRUE), especially since it has the additional risk that
users can define their own function called "curry".
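For comparison, the two transformed forms can be run side by side. This is a minimal sketch that borrows the curry definition posted further down in this thread; the variable names direct and curried are only illustrative.

```r
# curry() as posted later in this thread: it fixes trailing arguments of fun
# and returns a function that still expects the leading argument(s).
curry <- function(fun, ...) {
  L <- list(...)
  function(...) do.call(fun, c(list(...), L))
}

x <- c(1, 3, NA)

# What x |> mean(na.rm = TRUE) is transformed to under the current proposal.
direct <- mean(x, na.rm = TRUE)

# What x |> curry(mean, na.rm = TRUE) would be transformed to under the
# functions-only proposal.
curried <- curry(mean, na.rm = TRUE)(x)

identical(direct, curried)
```

Both forms compute the same value; the difference under discussion is only how much the reader has to unpack in the transformed expression.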



First, I do agree that

x |> mean(na.rm = TRUE)

is cleaner and this covers most of the use cases of users and many users
are used to the syntax from the magrittr pipes.

However, for programmers (there is no distinct line between users and
programmers), it is simpler to reason about, in the sense that lhs |> rhs
always means rhs(lhs); this does not depend on whether rhs is a call or an
(anonymous) function (not sure what is called what; which perhaps
illustrates the difficulty).


I think your proposed rule is pretty simple, with just one case:

lhs |> rhs

would transform to rhs(lhs).  Yes, that's simple.

The current rule is not as simple as yours, but it only has two cases
instead of 1.  Both involve the rhs being a call, nothing else.

Case 1, the common one:  rhs is a call to a function using regular
syntax, e.g. f(args) where args might be empty.  Then it is transformed
to f(lhs, args).

Case 2:  rhs is a call to `function`, which we normally write as
"function(args) body", which is transformed to (function(args) body)(lhs).

That's it!  Nothing else is allowed.  Not as simple as yours, but simple
enough to be trivial to reason about.  Most of the effort would be spent
in figuring out how the transformed expression would evaluate, and since
your transformed expression is more complicated in the common case where
currying is needed, I prefer the current proposal.
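The two cases can be sketched as a rewrite on quoted expressions. This is only an illustration under stated assumptions: pipe_transform is a hypothetical helper, not the way the parser implements |>, and it deliberately avoids the |> syntax itself so that it runs on any version of R.

```r
# Hypothetical helper emulating the two-case rule on quoted expressions.
# lhs and rhs are unevaluated expressions, e.g. created with quote().
pipe_transform <- function(lhs, rhs) {
  # Both cases require the rhs to be a call; anything else is rejected.
  stopifnot(is.call(rhs))
  if (identical(rhs[[1]], as.name("function"))) {
    # Case 2: rhs is a call to `function`  ->  (function(args) body)(lhs)
    as.call(list(rhs, lhs))
  } else {
    # Case 1: rhs is f(args)  ->  f(lhs, args)
    as.call(c(list(rhs[[1]], lhs), as.list(rhs)[-1]))
  }
}

lhs <- quote(c(1, 3, NA))

# Case 1: lhs is inserted as the first argument of the call.
case1 <- pipe_transform(lhs, quote(mean(na.rm = TRUE)))

# Case 2: the anonymous function is called on lhs.
case2 <- pipe_transform(lhs, quote(function(x) mean(x, na.rm = TRUE)))

print(case1)
print(case2)
eval(case1)
eval(case2)
```

Printing case1 and case2 shows the transformed expressions before they are evaluated, which is the step the discussion above treats as trivial.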


As soon as you start to have functions returning functions, you have to
think about how many brackets you have to place where. Being able to use
functions returning functions does open up possibilities for
programmers, as illustrated for example in my example using expressions.
This would have been much less clear.

I think your examples would work in the current system, too, with a
small change to fexpr.  A corresponding change to curry could be made,
but then it wouldn't be doing currying, so I won't do that.  Here's your
example rewritten in the R-devel system:


fexpr <- function(x, expr){
  expr <- substitute(expr)
  f <- function(.) {}
  body(f) <- expr
  f(x)
}
. <- fexpr


1:10 |> mean()
c(1,3,NA) |> mean(na.rm = TRUE)
c(1,3,NA) |> .( mean(., na.rm = TRUE) ) |> identity()
c(1,3,NA) |> .( . + 4)
c(1,3,NA) |> fexpr( . + 4)
c(1,3,NA) |> function(x) mean(x, na.rm = TRUE) |> fexpr(. + 1)

That produces the same outputs as your code.

Duncan Murdoch

On the argument of users being able to redefine curry: yes, they can, and
this is perhaps a good thing. They can also redefine a lot of other
stuff. And I am not suggesting that curry or fexpr or . are good names.
You could even have a curry operator.

Best,
Jan


Duncan Murdoch


The problem with only allowing

x |> mean

and not

x |> mean()

is with additional arguments. However, this can be solved with a
currying function, for example:

x |> curry(mean, na.rm = TRUE)

The cost is a few additional characters.

In the same way it is possible to write a function that accepts an
expression and returns a function containing that expression. This can
be used to have expressions on the right-hand side and reduces the need
for anonymous functions.

x |> fexpr(. + 10)
dta |> fexpr(lm(y ~ x, data = .))

You could call this function .:

x |> .(. + 10)
dta |> .(lm(y ~ x, data = .))


Dummy example code (thanks to a colleague of mine):


fexpr <- function(expr){
     expr <- substitute(expr)
     f <- function(.) {}
     body(f) <- expr
     f
}
. <- fexpr

curry <- function(fun,...){
     L <- list(...)
     function(...){
       do.call(fun, c(list(...),L))
     }
}

`%|>%` <- function(e1, e2) {
     e2(e1)
}


# All pipes below use the %|>% operator defined above, so no package is needed.
1:10 %|>% mean
c(1,3,NA) %|>% curry(mean, na.rm = TRUE)
c(1,3,NA) %|>% .( mean(., na.rm = TRUE) ) %|>% identity
c(1,3,NA) %|>% .( . + 4)
c(1,3,NA) %|>% fexpr( . + 4)
c(1,3,NA) %|>% function(x) mean(x, na.rm = TRUE) %|>% fexpr(. + 1)

--
Jan

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

