On 01/04/2015 1:35 PM, Gabriel Becker wrote:
Joris,
The second argument to evalq is envir, so that line says, roughly, "call
environment() to generate me a new environment within the environment
defined by data".
I think that's not quite right. environment() returns the current
environment; it doesn't create a new one. It is evalq() that creates a
new environment from data, and environment() just returns it.
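
As a minimal sketch of that distinction (the data frame and the use of the global environment as parent are assumptions made only for illustration), one can check that evalq() is what builds the environment and that environment() merely hands it back:

```r
# Hypothetical data, used only for illustration.
data <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
parent <- globalenv()

# evalq() builds a new environment from the columns of data, enclosed by
# parent. environment() is evaluated inside it and simply returns it.
e <- evalq(environment(), data, parent)

is.environment(e)                   # e is an environment
sort(ls(e))                         # it contains the column names
identical(parent.env(e), parent)    # its enclosure is parent

# A second evalq() call builds a second, distinct environment.
e2 <- evalq(environment(), data, parent)
identical(e, e2)
```

Only one environment exists per evalq() call; the columns live in it directly, and parent is simply its enclosure.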
Here's what happens. Each piece of code comes first, with the description
of what it does on the line below it.

    parent <- parent.frame()

Get the environment from which within.data.frame was called.

    e <- evalq(environment(), data, parent)

Create a new environment containing the columns of data, with the parent
being the environment where we were called.
Return it and store it in e.

    eval(substitute(expr), e)

Evaluate the expression in this new environment.

    l <- as.list(e)

Convert it to a list.

    l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]

Delete NULL entries from the list.

    nD <- length(del <- setdiff(names(data), (nl <- names(l))))

Find out if any columns were deleted.

    data[nl] <- l

Set the columns of data to the values from the list.

    if (nD)
        data[del] <- if (nD == 1)
            NULL
        else vector("list", nD)
    data

Delete the columns from data which were deleted from the list, then
return the modified data.
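
The steps above can be traced by hand outside of within(). This is only a sketch: the data frame is made up, the global environment stands in for parent.frame(), and a quoted expression stands in for substitute(expr).

```r
# Hypothetical data frame, for illustration only.
df <- data.frame(x = 1:3, y = c(10, 20, 30))
data <- df

# Environment holding the columns x and y, enclosed by the global environment.
e <- evalq(environment(), data, globalenv())

# Stand-in for eval(substitute(expr), e): add a column z and remove y.
eval(quote({ z <- x * 2; rm(y) }), e)

# Same bookkeeping as in the function body quoted above.
l <- as.list(e)
l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
nD <- length(del <- setdiff(names(data), (nl <- names(l))))
data[nl] <- l
if (nD)
    data[del] <- if (nD == 1) NULL else vector("list", nD)

names(data)    # x is kept, y is gone, z was appended
```

The original df is untouched; only the local copy data is modified, which is why within() has to return it.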
Note that that line is only generating e, the environment that expr will be
evaluated within on the next line (the call to eval). This means that expr
is evaluated in an environment which is inside the environment defined by
data, so you get non-standard evaluation in that symbols defined in data
will be available to expr earlier in symbol lookup than those in the
environment that within() was called from.
This again sounds like there are two environments created, when really
there's just one, but the last part is correct.
Duncan Murdoch
This is easy to confirm from the behavior of these functions:
> df = data.frame(x = 1:10, y = rnorm(10))
> x = "I'm a character"
> mean(x)
[1] NA
Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
> within(df, mean.x <- mean(x))
    x            y mean.x
1   1  0.396758869    5.5
2   2  0.945679050    5.5
3   3  1.980039723    5.5
4   4 -0.187059706    5.5
5   5  0.008220067    5.5
6   6  0.451175885    5.5
7   7 -0.262064017    5.5
8   8 -0.652301191    5.5
9   9  0.673609455    5.5
10 10 -0.075590905    5.5
> with(df, mean(x))
[1] 5.5
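
The same lookup order applies when within() is called from inside a function, which is where the risk raised in the original question shows up. A minimal sketch, with scale_by() being a hypothetical helper invented for this example:

```r
# Hypothetical helper: multiplies column x by the argument k.
scale_by <- function(data, k) within(data, scaled <- x * k)

# No column named k: k is found in the frame of scale_by().
df1 <- data.frame(x = 1:10)
r1 <- scale_by(df1, 2)
r1$scaled

# A column named k exists: it is found first and shadows the argument.
df2 <- data.frame(x = 1:10, k = 100)
r2 <- scale_by(df2, 2)
r2$scaled
```

In the second call the argument k = 2 is silently ignored, because the column k is found earlier in the symbol lookup.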
P.S. this is probably an r-help question.
Best,
~G
On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys <jorism...@gmail.com> wrote:
> Dear list members,
>
> I'm a bit confused about the evaluation of expressions using with() or
> within() versus subset() and transform(). I always teach my students to use
> with() and within() because of the warning mentioned in the help pages of
> subset() and transform(). Both functions use nonstandard evaluation and are
> to be used only interactively.
>
> I've never seen that warning on the help page of with() and within(), so I
> assumed both functions can safely be used in functions and packages. I've
> now been told that both functions pose the same risk as subset() and
> transform().
>
> Looking at the source code I've noticed the extra step:
>
> e <- evalq(environment(), data, parent)
>
> which, at least according to my understanding, should ensure that the
> functions follow the standard evaluation rules. Could somebody with more
> knowledge than I have shed a bit of light on this issue?
>
> Thank you
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel : +32 (0)9 264 61 79
> joris.m...@ugent.be
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>