Re: [Rd] Multiple Assignment built into the R Interpreter?

Duncan Murdoch Sun, 12 Mar 2023 03:07:06 -0700

I really like it!  Nicely done.

Duncan Murdoch



On 11/03/2023 6:00 p.m., Kevin Ushey wrote:

FWIW, it's possible to get fairly close to your proposed semantics
using the existing metaprogramming facilities in R. I put together a
prototype package here to demonstrate:

     https://github.com/kevinushey/dotty

The package exports an object called `.`, with a special `[<-.dot` S3
method which enables destructuring assignments. This means you can
write code like:

     .[nr, nc] <- dim(mtcars)

and that will define 'nr' and 'nc' as you expect.
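
For readers who want to see the mechanism, here is a minimal sketch of how such a `[<-` method can work. It is not the dotty source: the object and method below are a simplified stand-in written for illustration, and they assume every target inside the brackets is a bare name.

```r
# Simplified stand-in for the idea described above (not dotty's actual code).
. <- structure(list(), class = "dot")

`[<-.dot` <- function(x, ..., value) {
  # The index arguments arrive unevaluated, so they can be captured as symbols.
  targets <- eval(substitute(alist(...)))
  stopifnot(all(vapply(targets, is.symbol, logical(1))))

  # Assign by position into the frame where `.[...] <- value` was written.
  envir <- parent.frame()
  for (i in seq_along(targets)) {
    assign(as.character(targets[[i]]), value[[i]], envir = envir)
  }

  # A replacement function must return the (unchanged) object.
  x
}

.[nr, nc] <- dim(mtcars)
```

After the last line, 'nr' and 'nc' hold the row and column counts of mtcars. The trick relies on `[<-` dispatching on the class of `.` before the index arguments are evaluated.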

As for R CMD check warnings, you can suppress those through the use of
globalVariables(), and that can also be automated within the package.
The 'dotty' package includes a function 'dotify()' which automates
looking for such usages in your package, and calling globalVariables()
so that R CMD check doesn't warn. In theory, a similar technique would
be applicable to other packages defining similar operators (zeallot,
collapse).

Obviously, globalVariables() is a very heavy hammer to swing for this
issue, but you might consider the benefits worth the tradeoffs.
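
For comparison, the manual form of that suppression looks roughly like the following. `utils::globalVariables()` is the real function; the variable names are taken from the example above, and the file name in the comment is only a common convention.

```r
# Usually placed at top level in a package source file (for example
# R/globals.R; the file name is arbitrary).
# Declares 'nr' and 'nc' so that the code analysis in R CMD check does not
# flag them as undefined global variables.
if (getRversion() >= "2.15.1") {
  registered <- utils::globalVariables(c("nr", "nc"))
}
```

The declaration applies to the whole package, not to a single function, which is why it is a heavy hammer.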

Best,
Kevin

On Sat, Mar 11, 2023 at 2:53 PM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:


On 11/03/2023 4:42 p.m., Sebastian Martin Krantz wrote:

Thanks Duncan and Ivan for the careful thoughts. I'm not sure I can
follow all aspects you raised, but to give my limited take on a few:

> your proposal violates a very basic property of the language, i.e. that
> all statements are expressions and have a value.
> What's the value of 1 + (A, C = init_matrices())?


I'm not sure I see the point here. I evaluated 1 + (d = dim(mtcars); nr
= d[1]; nc = d[2]; rm(d)), which simply gives a syntax error,



    d = dim(mtcars); nr = d[1]; nc = d[2]; rm(d)

is not a statement; it is a sequence of 4 statements.

Duncan Murdoch

as the above expression should. `%=%` assigns to environments, so
1 + (c("A", "C") %=% init_matrices()) returns numeric(0), with A and C
having their values assigned.
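
The behaviour described here can be reproduced without collapse using a small stand-in operator. Both `%unpack%` and `init_matrices()` below are hypothetical names defined only for this sketch, and the operator is assumed to return NULL invisibly, which is what makes the sum come out as numeric(0).

```r
# Hypothetical stand-in for an environment-level, by-position assignment
# operator. It is not collapse's `%=%`.
`%unpack%` <- function(nam, values) {
  envir <- parent.frame()
  for (i in seq_along(nam)) {
    assign(nam[[i]], values[[i]], envir = envir)
  }
  invisible(NULL)  # the operator itself carries no useful value
}

# Hypothetical function returning two matrices.
init_matrices <- function() list(diag(2), matrix(0, 2, 2))

# 1 + NULL is numeric(0); A and C are assigned as a side effect.
res <- 1 + (c("A", "C") %unpack% init_matrices())
```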

> suppose f() returns list(A = 1, B = 2) and I do
> B, A <- f()
> Should assignment be by position or by name?


In other languages this is by position. The feature is not meant to
replace list2env(), and being able to rename objects in the assignment
is a vital feature of code that uses multi-input and multi-output
functions, e.g. in Matlab or Julia.
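
The difference between the two readings can be shown with base R alone. The function f() is the one from the quoted question, and the positional unpacking is written out by hand since no multiple-assignment syntax exists yet.

```r
f <- function() list(A = 1, B = 2)

# By position: the first element goes to the first name on the left,
# whatever it is called inside the list.
res <- f()
B <- res[[1]]
A <- res[[2]]

# By name: list2env() ignores any order and uses the element names.
env <- new.env()
list2env(f(), envir = env)
```

With positional semantics B ends up as 1 and A as 2, while the by-name version stores A = 1 and B = 2 in env.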

> Honestly, given that this is simply syntactic sugar, I don't think I
> would support it.


You can call it that, but it would be used by almost every R user almost
every day. It would simplify things like nr, nc = dim(x) or
values, vectors = eigen(x), where the creation of intermediate objects
is cumbersome and redundant.

> I see you've already mentioned it ("JavaScript-like"). I think it would
> fulfil Sebastian's requirements too, as long as it is considered "true
> assignment" by the rest of the language.


I don't have strong opinions about how the issue is phrased or
implemented. Something like [t, n] = dim(x) might even be clearer.
It's important, though, that assignment remains by position, so even if
some output gets thrown away, that should also be positional.

> A <- 0
> [A, B = A + 10] <- list(1, A = 2)


I also fail to see the use of allowing this. Something like this is
already an error:

A = 2
(B = A + 1) <- 1

Error in (B = A + 1) <- 1 : could not find function "(<-"

Regarding the practical implementation, I think `collapse::%=%` is a
good starting point. It could be introduced in R as a separate function,
or `=` could be modified to accommodate its capability. It should be
clear that with more than one LHS variable the assignment is an
environment-level operation, and the results can only be used in
computations once assigned to the environment. E.g. in
1 + (c("A", "C") %=% init_matrices()), A and C are not available for the
addition in this statement. The interpreter then needs to be modified to
read something like nr, nc = dim(x) or [nr, nc] = dim(x) as an
environment-level multiple assignment operation with no immediate value.
This appears very feasible to my limited understanding, but I guess there
are other things to consider still. I definitely appreciate the responses
so far, though.

Best regards,

Sebastian





On Sat, 11 Mar 2023 at 20:38, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

     On 11/03/2023 11:57 a.m., Ivan Krylov wrote:
      > On Sat, 11 Mar 2023 11:11:06 -0500
      > Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
      >
      >> That's clear, but your proposal violates a very basic property of
      >> the language, i.e. that all statements are expressions and have a
      >> value.
      >
      > How about reframing this feature request from multiple assignment
      > (which does go contrary to "everything has only one value, even if
      > it's sometimes invisible(NULL)") to "structured binding" /
      > "destructuring assignment" [*], which takes this single value
      > returned by the expression and subsets it subject to certain rules?
      > It may be easier to make a decision on the semantics for
      > destructuring assignment (e.g. languages which have this feature
      > typically allow throwing unneeded parts of the return value away),
      > and it doesn't seem to break as much of the rest of the language if
      > implemented.
      >
      > I see you've already mentioned it ("JavaScript-like"). I think it
      > would fulfil Sebastian's requirements too, as long as it is
      > considered "true assignment" by the rest of the language.
      >
      > The hard part is to propose the actual grammar of the new feature (in
      > terms of src/main/gram.y, preferably without introducing conflicts)
      > and its semantics (including the corner cases, some of which you have
      > already mentioned). I'm not sure I'm up to the task.
      >

     If I were doing it, here's what I'd propose:

         '[' formlist ']' LEFT_ASSIGN expr
         '[' formlist ']' EQ_ASSIGN expr
         expr RIGHT_ASSIGN  '[' formlist ']'

     where `formlist` has the syntax of the formals list for a function
     definition.  This would have the following semantics:

          {
            *tmp* <- expr

            # For arguments with no "default" expression,

            argname1 <- *tmp*[[1]]
            argname2 <- *tmp*[[2]]
            ...

            # For arguments with a default listed

            argname3 <- with(*tmp*, default3)
          }


     The value of the whole thing would therefore be (invisibly) the
     value of the last item in the assignment.
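
The recipe can be emulated in today's R with an ordinary function, which makes the proposed semantics easier to experiment with. This is only a sketch: `destructure()` is a hypothetical helper, and the formlist is simplified into a character vector of plain names plus a named list of default expressions, since the bracket syntax itself does not parse.

```r
# Hypothetical helper emulating the recipe above in current R.
destructure <- function(positional, defaults = list(), value,
                        envir = parent.frame()) {
  tmp <- value   # plays the role of *tmp*
  last <- NULL

  # Arguments with no default: assigned by position.
  for (i in seq_along(positional)) {
    last <- tmp[[i]]
    assign(positional[[i]], last, envir = envir)
  }

  # Arguments with a default: evaluated the way with(*tmp*, default) would be.
  for (nm in names(defaults)) {
    last <- eval(defaults[[nm]], tmp, envir)
    assign(nm, last, envir = envir)
  }

  # Invisibly return the value of the last item assigned.
  invisible(last)
}

# Emulates: [A, B, C = a + b] <- list(10, 20, a = 1, b = 2)
res <- destructure(c("A", "B"), alist(C = a + b),
                   list(10, 20, a = 1, b = 2))
```

Here A and B take the first two elements, and C is computed from the named elements a and b, so res is the value assigned to C.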

     Two examples:

         [A, B, C] <- expr   # assign the first three elements of expr
                             # to A, B, and C

         [A, B, C = a + b] <- expr  # assign the first two elements of expr
                                    # to A and B,
                                    # assign with(expr, a + b) to C.

     Unfortunately, I don't think this could be done entirely by
     transforming the expression (which is the way |> was done), and
     that makes it a lot harder to write and to reason about.  E.g. what
     does this do?

         A <- 0
         [A, B = A + 10] <- list(1, A = 2)

     According to the recipe above, I think it sets A to 1 and B to 12, but
     maybe a user would expect B to be 10 or 11.  And according to that
     recipe this is an error:

         [A, B = A + 10] <- c(1, A = 2)

     which probably isn't what a user would expect, given that this is fine:

         [A, B] <- c(1, 2)
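
Both corner cases can be checked in current R by writing the recipe's steps out by hand. The variable tmp stands in for `*tmp*`; nothing here uses the proposed syntax.

```r
# Case 1: [A, B = A + 10] <- list(1, A = 2)
A <- 0
tmp <- list(1, A = 2)
A <- tmp[[1]]             # positional part: A becomes 1
B <- with(tmp, A + 10)    # default part: A is looked up inside tmp first

# Case 2: the same default applied to an atomic vector.
# with() cannot use a numeric vector of length 2 as its data argument.
vec_result <- tryCatch(
  with(c(1, A = 2), A + 10),
  error = function(e) "error"
)
```

In the first case B is computed from the A stored in the list (2), not from the old A (0) or the newly assigned A (1). In the second case the with() call signals an error, which the tryCatch() records.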

     Duncan Murdoch


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

