On Jan 4, 2011, at 22:35, John Fox wrote:

> Dear r-devel list members,
>
> On a couple of occasions I've encountered the issue illustrated by the
> following examples:
>
> --------- snip -----------
>
>> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
> +         Armed.Forces + Population + Year, data=longley)
>
>> mod.2 <- update(mod.1, . ~ . - Year + Year)
>
>> all.equal(mod.1, mod.2)
> [1] TRUE
>>
>> f <- function(mod){
> +     subs <- 1:10
> +     update(mod, subset=subs)
> +     }
>
>> f(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
>> f(mod.2)
> Error in eval(expr, envir, enclos) : object 'subs' not found
>
> --------- snip -----------
>
> I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
> the formulas therein, are associated with different environments, but I
> don't quite see why.
>
> Anyway, here are two "solutions" that work, but neither is in my view
> desirable:
>
> --------- snip -----------
>
>> f1 <- function(mod){
> +     assign(".subs", 1:10, envir=.GlobalEnv)
> +     on.exit(remove(".subs", envir=.GlobalEnv))
> +     update(mod, subset=.subs)
> +     }
>
>> f1(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = .subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
>> f1(mod.2)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = .subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
>> f2 <- function(mod){
> +     env <- new.env(parent=.GlobalEnv)
> +     attach(NULL)
> +     on.exit(detach())
> +     assign(".subs", 1:10, pos=2)
> +     update(mod, subset=.subs)
> +     }
>
>> f2(mod.1)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = .subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
>> f2(mod.2)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = .subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
> --------- snip -----------
>
> The problem with f1() is that it will clobber a variable named .subs in the
> global environment; the problem with f2() is that .subs can be masked by a
> variable in the global environment.
>
> Is there a better approach?

I think the best way would be to modify the environment of the formula.
Something like the below, except that it doesn't actually work...

f3 <- function(mod) {
  f <- formula(mod)
  environment(f) <- e <- new.env(parent=environment(f))
  mod <- update(mod, formula=f)
  evalq(.subs <- 1:10, e)
  update(mod, subset=.subs)
}

The catch is that it is not quite so easy to update the formula of a model.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
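
A rough sketch of how the formula-environment idea above might be made to work. It assumes, without the thread confirming it, that it is acceptable to bypass update() for the formula and instead edit the stored call directly. The name f4 is hypothetical and not from the thread; e and .subs follow f3. The sketch places .subs in a child of the formula's environment, embeds that formula object in the call, and re-evaluates the call, so nothing is written to the global environment.

```r
# Hypothetical helper (f4 is not from the thread): refit `mod` on rows 1:10
# by editing the stored call instead of going through update().
f4 <- function(mod) {
  cl <- getCall(mod)                      # the stored lm(...) call
  fm <- formula(mod)
  # Child of the formula's environment that holds only .subs
  e <- new.env(parent = environment(fm))
  assign(".subs", 1:10, envir = e)
  environment(fm) <- e
  cl$formula <- fm                        # embed the formula object itself, so e is kept
  cl$subset <- quote(.subs)               # model.frame() looks this up via environment(fm)
  eval(cl, parent.frame())                # data=longley is still resolved from the caller
}

mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

m1 <- f4(mod.1)
m2 <- f4(mod.2)
all.equal(coef(m1), coef(m2))
```

Because the refitted model's call carries a formula whose environment contains .subs, a later update() of m1 or m2 should still be able to find it; whether that side effect is desirable is a judgment call.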