> > On 06/12/2020 8:22 p.m., Bravington, Mark (Data61, Hobart) wrote:
> (and Duncan Murdoch responded, as below)

It still seems to me that placeholders are viable and unambiguous (only as 
things in RHS of pipes), and that something like  

x |> foo( _PIPE_) 
x |> bah( otherarg, _PIPE_)
x |> { y <- _PIPE_+1; _PIPE_ / y } # anonymous function

would be a viable solution that doesn't break syntax. This suggestion expands 
slightly on my previous post, to deal with anonymous functions without either 
ugly or cumbersome syntax. Specifically, the parser could expand curly-braces 
in a pipe RHS like this:

x |> {expr} 

into

_ANON_( _PIPE_, {expr})( x)

where '_ANON_' itself  constructs-and-calls a function with one argument called 
'_PIPE_' whose body consists of 'expr' (which presumably refers to '_PIPE_')

No new weird operators like \(, and the only break with normal syntax is the 
expansion of curly-braces--- plus treating one symbol, _PIPE_ or any other 
pre-agreed one perhaps even just _, as a legal name inside a pipe.

Now back to my previous post and responding to Duncan's comments:

> >   - I don't see the problem with a placeholder--- doesn't it remove all 
> > ambiguity? Sure there needs to be a standard unclashable name and people 
> > can argue about what that should be, but the following seems clear and 
> > flexible... to me, anyway:
> >   
> >   thing |>
> >     foo( _PIPE_) |>           # standard
> >     bah( arg1, _PIPE_) |>   # multi-arg function
> >     _ANON_({ x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ })   # anon function
> >    
> > where '_PIPE_' is the ordained name of the placeholder, and '_ANON_' 
> > constructs-and-calls a function with single argument '_PIPE_'. There is 
> > just one rule (I think...): each pipe-stage must be a *call* involving the 
> > argument '_PIPE_'.

and (see end) I think the line with _ANON_ above could in fact just be

  ... |> { x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ }
  
> I believe there's no ambiguity if the placeholder is *only* allowed in the 
> RHS of a pipe expression.

Yes--- that's my suggestion: there and only there. Hence an otherwise-illegal 
name like _PIPE_.

> I think the ambiguity arises if you allow 
> the same syntax to be used to generate anonymous functions.

Agreed; don't allow that outside of pipes.

> We can't use _PIPE_ as the placeholder, because it's a legal name.

Surely _PIPE_ is not a legal R name though? Not in my R4.1.0dev, anyway. Of 
course, the parser can _make_ it legal *only* in the context of RHS of pipe. 
Anyway, I'm not suggesting the names must be _PIPE_ and _ANON_--- the name is 
secondary to the concept.
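
For what it's worth, the legality question is easy to check in base R. A minimal sketch (the name _PIPE_ is just the example from this thread, and the variable names are made up for illustration): a bare _PIPE_ fails to parse, while the backquoted form is accepted like any other non-syntactic name.

```r
# A bare _PIPE_ is not a syntactic name, so parsing it fails.
bare_ok <- !inherits(try(parse(text = "_PIPE_ <- 1"), silent = TRUE),
                     "try-error")

# make.names() has to mangle it to obtain a syntactic name.
mangled <- make.names("_PIPE_")

# Backquoted, it behaves as an ordinary (non-syntactic) name.
`_PIPE_` <- 1
quoted_value <- `_PIPE_` + 1
```

So a parser that accepted the bare form only on the RHS of a pipe would not collide with anything that currently parses.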

> But we could use _.  

Yes, any unclashable name would do--- though "_" is hard to read (as per 
complaints about "." as placeholder, which I use in my own code). And I 
remember back when R decided to disallow "_" as an assignment operator. Which 
forced me to change about 50000 lines of code that wasn't stored in text files, 
so I wasn't happy--- and the only reason that was ever given to me, was "it's 
ugly" ! Which is kinda true.

> However, implementing this makes the parser pretty ugly: its handling of _ 
> depends on the outer context.  

That context issue is already the case under the current proposal though; eg 
'head(10)' in 

x |> head(10) # really head( <arg>, 10)

or indeed 'head' in 

x |> head # actually a call to head(<arg>), not a symbol
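
The first case can be seen directly in a build that has the native pipe (R 4.1.0 or later is assumed here). Quoting the pipe expression already returns the rewritten call, so the rewriting happens in the parser, before anything is evaluated. The variable names below are only for illustration.

```r
# The parser turns the pipe into an ordinary call before evaluation.
piped  <- quote(x |> head(10))
direct <- quote(head(x, 10))

# Both are the same language object; no trace of |> is left.
same_call <- identical(piped, direct)
deparsed  <- deparse(piped)
```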


> I now agree that leaving out placeholder syntax was the right decision.

Others are not so sure!

Re operator precedence: thanks for pointing that out, I accept that R's current 
operator-precedence rules make it impossible for people to cleanly implement 
their own preferred homemade %my_pipe_op%.

All the more reason to be careful with the pipe syntax choice for base-R, of 
course...
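
The precedence point can be checked with a throwaway operator. A minimal sketch, assuming a hypothetical homemade pipe called %|>% that simply applies the function on its right to the value on its left (it is not magrittr and not the base-R pipe):

```r
# Hypothetical homemade pipe: apply f to lhs.
`%|>%` <- function(lhs, f) f(lhs)

# %any% binds tighter than *, so this is 2 * sqrt(9), not sqrt(2 * 9).
tight <- 2 * 9 %|>% sqrt

# Parentheses are needed to pipe the whole product.
whole <- (2 * 9) %|>% sqrt

# The parse tree says the same thing: the top-level call is `*`.
top_call <- as.character(quote(x * y %|>% f)[[1]])
```

No user-defined %any% operator can escape this, since all of them share one fixed precedence level.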


> Then

>    x |> (_ + 1) + mean(_)

> could expand unambiguously to

>    (function(_) (_  + 1) + mean(_))(x)

> but

>    (_ + 1) + mean(_)

> shouldn't be taken to be an anonymous function declaration, otherwise 
> things like

>    mean(_ |> _)

> do become ambiguous:  is the second placeholder the argument to the anon 
> function, or is it the placeholder for the embedded pipe?

That wouldn't be allowed in my proposal; the "_ |> _" is illegal because the 
RHS is not a call. For the whole thing, I'd require

x |> _ANON_((_PIPE_ + 1) + mean( _PIPE_))

or the just-curly version

x |> { (_PIPE_ + 1) + mean( _PIPE_)} 

but with the implication that this would parse out to 

(`_ANON_`( { (_PIPE_ + 1) + mean( _PIPE_)}))( x)

> > [*] Definition of _ANON_ could be something like this--- almost certainly 
> > won't work as-is, this is just to point out that it could be done in 
> > standard R.

> > `_ANON_` <- function( expr) { 
> >   #1. Construct a function with arg '_PIPE_' and body 'expr'
> >   #2. Construct a call() to that function
> >   #3. Do the call

> >   f <- function( `_PIPE_`) NULL
> >   body( f) <- expr
> >   environment( f) <- parent.frame() # or something... yes these details are
> >   # almost certainly wrong
> >   expr2 <- substitute( f( `_PIPE_`)) # or something...
> >   eval.parent( expr2) # or something... 
> > }
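
For concreteness, here is one variant of the quoted sketch that does run in standard R. It is only an illustration, not a claim about how a real implementation would look. This `_ANON_` captures its argument unevaluated and returns the function rather than calling it, matching the expansion of the form (`_ANON_`( {...}))( x) given above. The environment handling is an assumption and may well not be what a real version would want.

```r
`_ANON_` <- function(expr) {
  # Capture the expression unevaluated so `_PIPE_` is not looked up yet.
  body_expr <- substitute(expr)
  # One-argument function whose only formal is `_PIPE_`.
  f <- function(`_PIPE_`) NULL
  body(f) <- body_expr
  # Resolve free variables where the pipe was written (an assumption).
  environment(f) <- parent.frame()
  f
}

# What  x |> { y <- _PIPE_ + 1; _PIPE_ / y }  would expand to, with x = 3.
# Backquotes are needed because today's parser rejects a bare _PIPE_.
result <- (`_ANON_`({ y <- `_PIPE_` + 1; `_PIPE_` / y }))(3)

# A single-expression stage works the same way.
double_it <- `_ANON_`(`_PIPE_` * 2)
doubled <- double_it(4)
```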


Mark Bravington
CSIRO Marine Lab
Hobart
Australia


________________________________________
From: Duncan Murdoch <murdoch.dun...@gmail.com>
Sent: Monday, 7 December 2020 21:31
To: Bravington, Mark (Data61, Hobart); Gabor Grothendieck; Gabriel Becker
Cc: r-devel@r-project.org
Subject: Re: [Rd] New pipe operator

On 06/12/2020 8:22 p.m., Bravington, Mark (Data61, Hobart) wrote:
> Seems like this *could* be a good thing, and thanks to R core for considering 
> it. But, FWIW:
>
>   - I agree with Gabor G that consistency of "syntax" should be paramount 
> here. Enough problems have been caused by earlier superficially-convenient 
> non-standard features in R.  In particular:
>
>   -- there should not be any discrepancy between an in-place 
> function-definition, and a predefined function attached to a symbol (as per 
> Gabor's point).
>
>   -- Hence, the ability to say x |> foo  ie without parentheses, seems bound 
> to lead to inconsistency, because x |> foo is allowed, x |> base::foo isn't 
> allowed without tricks, but x |> function( y) foo( y) isn't... So, x |> foo 
> is not worth keeping. Parentheses are a price well worth paying.
>
>   -- it is still inconsistent and confusing to (apparently) invoke a function 
> in some places--- normally--- via 'foo(x)', yet in others--- pipily--- via 
> 'foo()'. Especially if 'foo' already has a default value for its first 
> argument.
>
>   - I don't see the problem with a placeholder--- doesn't it remove all 
> ambiguity? Sure there needs to be a standard unclashable name and people can 
> argue about what that should be, but the following seems clear and 
> flexible... to me, anyway:
>
>   thing |>
>     foo( _PIPE_) |>           # standard
>     bah( arg1, _PIPE_) |>   # multi-arg function
>     _ANON_({ x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ })   # anon function
>
> where '_PIPE_' is the ordained name of the placeholder, and '_ANON_' 
> constructs-and-calls a function with single argument '_PIPE_'. There is just 
> one rule (I think...): each pipe-stage must be a *call* involving the 
> argument '_PIPE_'.

I believe there's no ambiguity if the placeholder is *only* allowed in
the RHS of a pipe expression.  I think the ambiguity arises if you allow
the same syntax to be used to generate anonymous functions.  We can't
use _PIPE_ as the placeholder, because it's a legal name.  But we could
use _.  Then

   x |> (_ + 1) + mean(_)

could expand unambiguously to

   (function(_) (_  + 1) + mean(_))(x)

but

   (_ + 1) + mean(_)

shouldn't be taken to be an anonymous function declaration, otherwise
things like

   mean(_ |> _)

do become ambiguous:  is the second placeholder the argument to the anon
function, or is it the placeholder for the embedded pipe?

However, implementing this makes the parser pretty ugly:  its handling
of _ depends on the outer context.  I now agree that leaving out
placeholder syntax was the right decision.


>
>
>   - The proposed anonymous-function syntax looks quite ugly to me, 
> diminishing readability and inviting errors. The new pipe symbol |> already 
> looks scarily like quantum mechanics; adding \( just puts fishbones into the 
> symbolic soup.
>
>   - IMO it's not worth going too far to try to lure magrittr-etc fans to 
> swap to the new; my experience is that many people keep using older inferior 
> R syntax for years after better replacements become available (even if they 
> are aware of replacements), for various reasons. Just provide a good 
> framework, and let nature take its course.
>
>   - Disclaimer: personally I'm not much of a pipehead anyway, so maybe I'm 
> not the audience. But if I was to consider piping, I wouldn't be very tempted 
> by the current proposal. OTOH, I might even be tempted to write--- and 
> use!--- my own version of '%|>%' as above (maybe someone already has). And if 
> R did it for me, that'd be great :)

Yours would suffer one of the same problems as magrittr's:  it has the
wrong operator precedence.  The current precedence ordering (from
?Syntax) is, from highest to lowest:


:: :::  access variables in a namespace
$ @     component / slot extraction
[ [[    indexing
^       exponentiation (right to left)
- +     unary minus and plus
:       sequence operator
%any%   special operators (including %% and %/%)
* /     multiply, divide
+ -     (binary) add, subtract
< > <= >= == != ordering and comparison
!       negation
& &&    and
| ||    or
~       as in formulae
-> ->>  rightwards assignment
<- <<-  assignment (right to left)
=       assignment (right to left)
?       help (unary and binary)


The %>% operator has higher precedence than the arithmetic operators, so

x*y %>% f()

is equivalent to x*f(y), not

f(x*y)

as it should "obviously" be.  I believe the new |> operator falls
between "| ||" and "~", so

x || y |> f()

is the same as f(x || y), and

x ~ y |> f()

is x ~ f(y).   There could be arguments about where the new one appears
(and there probably have been), but *clearly* magrittr's precedence is
wrong, and yours would be too, because they are both fixed at the quite
high precedence given to %any%.

Duncan Murdoch

>
> [*] Definition of _ANON_ could be something like this--- almost certainly 
> won't work as-is, this is just to point out that it could be done in standard 
> R.
>
> `_ANON_` <- function( expr) {
>    #1. Construct a function with arg '_PIPE_' and body 'expr'
>    #2. Construct a call() to that function
>    #3. Do the call
>
>    f <- function( `_PIPE_`) NULL
>    body( f) <- expr
>    environment( f) <- parent.frame() # or something... yes these details are
>    # almost certainly wrong
>    expr2 <- substitute( f( `_PIPE_`)) # or something...
>    eval.parent( expr2) # or something...
> }
>
> cheers
> Mark
>
> Mark Bravington
> CSIRO Marine Lab
> Hobart
> Australia
>
>
> ________________________________________
> From: R-devel <r-devel-boun...@r-project.org> on behalf of Gabor Grothendieck 
> <ggrothendi...@gmail.com>
> Sent: Monday, 7 December 2020 10:21
> To: Gabriel Becker
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] New pipe operator
>
> I understand very well that it is implemented at the syntax level;
> however, in any case the implementation is irrelevant to the principles.
>
> Here a similar example to the one I gave before but this time written out:
>
> This works:
>
>    3 |> function(x) x + 1
>
> but this does not:
>
>    foo <- function(x) x + 1
>    3 |> foo
>
> so it breaks the principle of functions being first class objects.  foo and
> its definition are not interchangeable.  You have to write 3 |> foo() but
> don't have to write 3 |> (function(x) x + 1)().
>
> This isn't just a matter of notation, i.e. foo vs foo(), but is a matter of
> breaking the way R works as a functional language with first class functions.
>
> On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker <gabembec...@gmail.com> wrote:
>>
>> Hi Gabor,
>>
>> On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck <ggrothendi...@gmail.com> 
>> wrote:
>>>
>>> I think the real issue here is that functions are supposed to be
>>> first class objects in R, and |> would break that if it is possible
>>> to write function(x) x + 1 on the RHS but not foo (assuming foo
>>> was defined as that function).
>>>
>>> I don't think getting experience with using it can change that
>>> inconsistency which seems serious to me and needs to
>>> be addressed even if it complicates the implementation
>>> since it drives to the heart of what R is.
>>>
>>
>> With respect I think this is a misunderstanding of what is happening here.
>>
>> Functions are first class citizens. |> is, for all intents and purposes, a 
>> macro.
>>
>> LHS |> RHS(arg2=5)
>>
>> parses to
>>
>> RHS(LHS, arg2 = 5)
>>
>> There are no functions at the point in time when the pipe transformation 
>> happens, because no code has been evaluated. To know if a symbol is going to 
>> evaluate to a function requires evaluation which is a step entirely after 
>> the one where the |> pipe is implemented.
>>
>> Another way to think about it is that
>>
>> LHS |> RHS(arg2 = 5)
>>
>> is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or even 
>> can be) evaluated.
>>
>>
>> Now this is a subtle point that only really has implications in as much as 
>> it is not the case for magrittr pipes, but it's relevant for discussions like 
>> this, I think.
>>
>> ~G
>>
>>> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>>> <ggrothendi...@gmail.com> wrote:
>>>>
>>>> The construct utils::head  is not that common but bare functions are
>>>> very common and to make it harder to use the common case so that
>>>> the uncommon case is slightly easier is not desirable.
>>>>
>>>> Also it is trivial to write this which does work:
>>>>
>>>> mtcars %>% (utils::head)
>>>>
>>>> On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage <hugh.parson...@gmail.com> 
>>>> wrote:
>>>>>
>>>>> I'm surprised by the aversion to
>>>>>
>>>>> mtcars |> nrow
>>>>>
>>>>> over
>>>>>
>>>>> mtcars |> nrow()
>>>>>
>>>>> and I think the decision to disallow the former should be
>>>>> reconsidered.  The pipe operator is only going to be used when the rhs
>>>>> is a function, so there is no ambiguity with omitting the parentheses.
>>>>> If it's disallowed, it becomes inconsistent with other treatments like
>>>>> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
>>>>> noise.  I'm not sure why this decision was taken
>>>>>
>>>>> If the only issue is with the double (and triple) colon operator, then
>>>>> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
>>>>> -- in other words, demote the precedence of |>
>>>>>
>>>>> Obviously (looking at the R-Syntax branch) this decision was
>>>>> considered, put into place, then dropped, but I can't see why
>>>>> precisely.
>>>>>
>>>>> Best,
>>>>> Hugh.
>>>>>
>>>>> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar <deepayan.sar...@gmail.com> 
>>>>> wrote:
>>>>>>
>>>>>> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch <murdoch.dun...@gmail.com> 
>>>>>> wrote:
>>>>>>>
>>>>>>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
>>>>>>>>>    Error: function '::' not supported in RHS call of a pipe
>>>>>>>>
>>>>>>>> To me, this error looks much more friendly than magrittr's error.
>>>>>>>> Some users have become too used to specifying functions without (). This
>>>>>>>> is OK until they use `::`, but when they need to use it, it takes
>>>>>>>> hours to figure out why
>>>>>>>>
>>>>>>>> mtcars %>% base::head
>>>>>>>> #> Error in .::base : unused argument (head)
>>>>>>>>
>>>>>>>> won't work but
>>>>>>>>
>>>>>>>> mtcars %>% head
>>>>>>>>
>>>>>>>> works. I think it is too harsh a lesson for ordinary R users to
>>>>>>>> learn that `::` is a function. I've been wanting magrittr to drop the
>>>>>>>> support for a function name without () to avoid this confusion,
>>>>>>>> so I would very much welcome the new pipe operator's behavior.
>>>>>>>> Thank you all the developers who implemented this!
>>>>>>>
>>>>>>> I agree, it's an improvement on the corresponding magrittr error.
>>>>>>>
>>>>>>> I think the semantics of not evaluating the RHS, but treating the pipe
>>>>>>> as purely syntactical is a good decision.
>>>>>>>
>>>>>>> I'm not sure I like the recommended way to pipe into a particular 
>>>>>>> argument:
>>>>>>>
>>>>>>>     mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
>>>>>>>
>>>>>>> or
>>>>>>>
>>>>>>>     mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
>>>>>>>
>>>>>>> both of which are equivalent to
>>>>>>>
>>>>>>>     mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
>>>>>>>
>>>>>>> It's tempting to suggest it should allow something like
>>>>>>>
>>>>>>>     mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
>>>>>>
>>>>>> Which is really not that far off from
>>>>>>
>>>>>> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
>>>>>>
>>>>>> once you get used to it.
>>>>>>
>>>>>> One consequence of the implementation is that it's not clear how
>>>>>> multiple occurrences of the placeholder would be interpreted. With
>>>>>> magrittr,
>>>>>>
>>>>>> sort(runif(10)) %>% ecdf(.)(.)
>>>>>> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
>>>>>>
>>>>>> This is probably what you would expect, if you expect it to work at all, 
>>>>>> and not
>>>>>>
>>>>>> ecdf(sort(runif(10)))(sort(runif(10)))
>>>>>>
>>>>>> There would be no such ambiguity with anonymous functions
>>>>>>
>>>>>> sort(runif(10)) |> \(.) ecdf(.)(.)
>>>>>>
>>>>>> -Deepayan
>>>>>>
>>>>>>> which would be expanded to something equivalent to the other versions:
>>>>>>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
>>>>>>> be used instead of ., since those are not legal variable names.)
>>>>>>>
>>>>>>> I don't think there should be an attempt to copy magrittr's special
>>>>>>> casing of how . is used in determining whether to also include the
>>>>>>> previous value as first argument.
>>>>>>>
>>>>>>> Duncan Murdoch
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Hiroaki Yutani
>>>>>>>>
>>>>>>>> On Fri, 4 Dec 2020 at 20:51, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Just saw this on the R-devel news:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
>>>>>>>>> notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
>>>>>>>>> ‘function(x) x + 1’. The pipe implementation as a syntax
>>>>>>>>> transformation was motivated by suggestions from Jim Hester and
>>>>>>>>> Lionel Henry. These features are experimental and may change prior
>>>>>>>>> to release.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is a good addition; by using "|>" instead of "%>%" there should
>>>>>>>>> be a chance to get operator precedence right.  That said, the
>>>>>>>>> ?Syntax help topic hasn't been updated, so I'm not sure where it
>>>>>>>>> fits in.
>>>>>>>>>
>>>>>>>>> There are some choices that take a little getting used to:
>>>>>>>>>
>>>>>>>>>    > mtcars |> head
>>>>>>>>> Error: The pipe operator requires a function call or an anonymous
>>>>>>>>> function expression as RHS
>>>>>>>>>
>>>>>>>>> (I need to say mtcars |> head() instead.)  This sometimes leads to
>>>>>>>>> error messages that are somewhat confusing:
>>>>>>>>>
>>>>>>>>>    > mtcars |> magrittr::debug_pipe |> head
>>>>>>>>> Error: function '::' not supported in RHS call of a pipe
>>>>>>>>>
>>>>>>>>> but
>>>>>>>>>
>>>>>>>>> mtcars |> magrittr::debug_pipe() |> head()
>>>>>>>>>
>>>>>>>>> works.
>>>>>>>>>
>>>>>>>>> Overall, I think this is a great addition, though it's going to be
>>>>>>>>> disruptive for a while.
>>>>>>>>>
>>>>>>>>> Duncan Murdoch
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-devel@r-project.org mailing list
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>>
>>>>
>>>> --
>>>> Statistics & Software Consulting
>>>> GKX Group, GKX Associates Inc.
>>>> tel: 1-877-GKX-GROUP
>>>> email: ggrothendieck at gmail.com

