Re: [R] lm looking for weights outside of the user-defined function
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Friday, October 22, 2010 9:43 PM
> To: William Dunlap
> Cc: Dimitri Liakhovitski; r-help
> Subject: Re: [R] lm looking for weights outside of the user-defined function
>
> On Oct 22, 2010, at 12:17 PM, William Dunlap wrote:
>
> > ...
> > The environment of the formula is the output of environment(formula), which is set to the current environment when the formula is created. The modelling functions look for variables (in the formula, weights, and subset arguments) in the order
> >    1) the data argument (usually an environment or a list)
> >    2) the environment of the formula
> > When an environment is searched for a name, the search continues through all ancestral environments until the name is found or until you run out of ancestors.
> >
> > You can reassign the environment of a formula. E.g., compare the following two:
> >
> >    wr0 <- function(formula, MyData, WeightsVector) {
> >        lm(formula, data=MyData, weights=WeightsVector)
> >    }
> >    wr1 <- function(formula, MyData, WeightsVector) {
> >        environment(formula) <- environment()
> >        lm(formula, data=MyData, weights=WeightsVector)
> >    }
> >    wr0(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
> >    Error in eval(expr, envir, enclos) : object 'WeightsVector' not found
>
> The wr0 call created a formula, but the weights vector was outside that environment? And that was because the formula was created at the stage of evaluating the function arguments, when they wouldn't see each other? Except this works:
>
>    xtoy <- function(x = 1:2, y=x){y}
>    xtoy()
>    [1] 1 2
>
> I'm trying to figure out what makes the wr0 version fail and the xtoy() function succeed.

The basic reason is that lm() calls eval() to evaluate things in the formula, weights, and subset arguments in a nonstandard (but well-defined) way, whereas xtoy() uses the standard argument evaluation rules.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> --
> David.
>
> >    wr1(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
> >
> >    Call:
> >    lm(formula = formula, data = MyData, weights = WeightsVector)
> >
> >    Coefficients:
> >    (Intercept)          cyl
> >         38.567       -2.966
> >
> > Reassigning the environment can lead to the sort of surprises that dynamic scoping gives you.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
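The difference between lm()'s nonstandard evaluation and ordinary argument evaluation can be imitated in a few lines. This is only a rough sketch: lookup_weights, wr0_like, standard_like and the argument name wts_arg are hypothetical names invented for illustration, and the real lm()/model.frame() code is more involved than this.

```r
# Hypothetical stand-in for the lookup a modelling function does on `weights`:
# capture the expression, then evaluate it in `data`, falling back to the
# environment of the formula (and that environment's ancestors).
lookup_weights <- function(formula, data, weights) {
  w_expr <- substitute(weights)             # the expression, not its value
  eval(w_expr, data, environment(formula))  # data first, then formula's environment
}

# Wrapper in the style of wr0: the formula was created by the caller,
# so its environment is the caller's, not this function's frame.
wr0_like <- function(formula, MyData, wts_arg) {
  lookup_weights(formula, MyData, wts_arg)
}

# Ordinary evaluation: the argument is simply visible inside the function.
standard_like <- function(formula, MyData, wts_arg) {
  length(wts_arg)
}

res <- tryCatch(
  wr0_like(mpg ~ cyl, mtcars, sqrt(1:32)),
  error = function(e) conditionMessage(e)
)
print(res)                                          # an "object ... not found" message
print(standard_like(mpg ~ cyl, mtcars, sqrt(1:32)))
```

The wrapper fails because the symbol wts_arg is looked up in mtcars and then in the environment where the formula was written, never in the wrapper's own frame; the ordinary function has no such problem.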
[R] lm looking for weights outside of the user-defined function
Dear R'ers,

I am fighting with a problem that is driving me crazy. I use lm in my user-defined function, but it seems to be looking for weights outside of my function's environment:

    ### Generating example data:
    x <- data.frame(y=rnorm(100,0,1), a=rnorm(100,1,1), b=rnorm(100,2,1))
    myweights <- runif(100)
    data.for.regression <- x[1:3]

    ### Creating function weighted.reg:
    weighted.reg=function(formula, MyData, filename, WeightsVector)
    {
      print(dim(MyData))
      print(filename)
      print(length(WeightsVector))
      regr.f <- lm(formula, MyData, weights=WeightsVector, na.action=na.omit)
      results <- as.data.frame(round(summary(regr.f)$coeff, 3))
      write.csv(results, file=filename)
      return(results)
    }

    ### Running weighted.reg with my data:
    reg2 <- weighted.reg(y~., MyData=x, WeightsVector=myweights, filename="TEST.csv")

I get an error:

    Error in eval(expr, envir, enclos) : object 'WeightsVector' not found

Notice that the function correctly prints length(WeightsVector). But it looks like lm is looking for weights (in the 4th line of the function) OUTSIDE the function and does not see WeightsVector.

Why is it looking outside the function for the object that has just been defined inside the function?

Thank you very much!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Re: [R] lm looking for weights outside of the user-defined function
On Oct 22, 2010, at 9:01 AM, Dimitri Liakhovitski wrote:

> I get an error:
>
>    Error in eval(expr, envir, enclos) : object 'WeightsVector' not found
>
> Notice that the function correctly prints length(WeightsVector). But it looks like lm is looking for weights (in the 4th line of the function) OUTSIDE the function and does not see WeightsVector.

Have you tried putting WeightsVector in the x data frame? That would seem to reduce the potential for environmental conflation. From the Details section of help(lm):

    "All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula."

> Why is it looking outside the function for the object that has just been defined inside the function?

David Winsemius, MD
West Hartford, CT
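The suggestion of putting the weights into the data frame might look roughly like the sketch below. The function name weighted_fit and the column name .w are hypothetical, chosen only for this illustration; mtcars and the weights sqrt(1:32) are borrowed from elsewhere in the thread.

```r
# Sketch: attach the weights to the data, then refer to them by column name,
# so the lookup succeeds in `data` before any environment is consulted.
weighted_fit <- function(formula, MyData, WeightsVector) {
  MyData$.w <- WeightsVector            # hypothetical column name
  lm(formula, data = MyData, weights = .w)
}

fit <- weighted_fit(mpg ~ cyl, mtcars, sqrt(1:32))

# Reference fit at top level, where the weights expression is visible directly.
fit_ref <- lm(mpg ~ cyl, data = mtcars, weights = sqrt(1:32))
print(all.equal(coef(fit), coef(fit_ref)))
```

One caveat, as far as I can tell: with a formula such as y ~ . the dot expands to every other column of the data, so the added weights column would be picked up as a predictor unless the formula names its terms explicitly.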
Re: [R] lm looking for weights outside of the user-defined function
David,

I understand - and I am sure what you are suggesting should work. But I just can't understand why it's not grabbing things INSIDE the environment of the formula first. I've already tried to define the weights outside of the function - and it finds them. But shouldn't it go in this order?

1. Look in the data frame
2. Look in the environment of the user-defined function
3. Look outside.

Dimitri

On Fri, Oct 22, 2010 at 9:15 AM, David Winsemius dwinsem...@comcast.net wrote:

> Have you tried putting WeightsVector in the x dataframe? That would seem to reduce the potential for environmental conflation. From the details section of help(lm): All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Re: [R] lm looking for weights outside of the user-defined function
On Oct 22, 2010, at 9:18 AM, Dimitri Liakhovitski wrote:

> David, I understand - and I am sure what you are suggesting should work. But I just can't understand why it's not grabbing things INSIDE the environment of the formula first.

I am not sure that either one of us understands what is meant by the environment of the formula.

> I've already tried to define the weights outside of the function - and it finds them. But shouldn't it go in this order?
> 1. Look in the data frame
> 2. Look in the environment of the user-defined function
> 3. Look outside.

Hey, I only work here, I don't make the rules, I just follow them. I agree that one might guess that to be the search order, but it is not what is documented.

--
David Winsemius, MD
West Hartford, CT
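The documented order (data first, then the environment of the formula) has a consequence worth seeing: if an object with the same name happens to exist where the formula was written, lm() uses it without complaint. A small sketch, using the wr0 wrapper and mtcars from elsewhere in the thread; the all-ones vector is an assumption made up for the demonstration.

```r
wr0 <- function(formula, MyData, WeightsVector) {
  lm(formula, data = MyData, weights = WeightsVector)
}

# An object with the same name, defined where the formula is written.
WeightsVector <- rep(1, 32)

# The argument passed in is sqrt(1:32), but lm() resolves the *name*
# WeightsVector in mtcars and then in the formula's environment.
fit <- wr0(mpg ~ cyl, MyData = mtcars, WeightsVector = sqrt(1:32))

print(unique(weights(fit)))   # shows which weights were actually used
```

This matches the observation that defining the weights outside the function makes the error go away, and it is also why that workaround is risky: the fit can silently use a stale vector.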
Re: [R] lm looking for weights outside of the user-defined function
As you suggested, David, the code below works. Now it can find the weights - because they are in the data frame x. But how can I be sure now that it actually grabs the data from the data frame variables and not the data frame x?

    x <- data.frame(y=rnorm(100,0,1), a=rnorm(100,1,1), b=rnorm(100,2,1), myweights=runif(100))
    names(x)

    weighted.reg=function(formula, MyData, filename, WeightsVector)
    {
      variables <- MyData[1:(length(MyData)-1)]  # creating a data frame without the weights
      print(dim(MyData))
      print(filename)
      print(length(WeightsVector))
      regr.f <- lm(formula, variables, weights=WeightsVector, na.action=na.omit)
      results <- as.data.frame(round(summary(regr.f)$coeff, 3))
      write.csv(results, file=filename)
      return(results)
    }

    reg2 <- weighted.reg(y~., MyData=x, filename="TEST.csv", WeightsVector=x$myweights)

Dimitri

On Fri, Oct 22, 2010 at 9:15 AM, David Winsemius dwinsem...@comcast.net wrote:

> Have you tried putting WeightsVector in the x dataframe? That would seem to reduce the potential for environmental conflation. From the details section of help(lm): All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
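One way to check which data and weights a fit actually used is to inspect the fitted object rather than trust the call. The sketch below is an assumption-laden illustration, not the code from the post: it fixes a seed, spells out the predictors instead of using y ~ ., and reads the weights back from the stored model frame.

```r
set.seed(1)  # assumption: a fixed seed only so the example is repeatable
x <- data.frame(y = rnorm(100, 0, 1), a = rnorm(100, 1, 1),
                b = rnorm(100, 2, 1), myweights = runif(100))

# Weights are referred to by column name, so they are found in `data`.
fit <- lm(y ~ a + b, data = x, weights = myweights)

# The model frame stored in the fit keeps the weights in a "(weights)" column.
used <- model.frame(fit)[["(weights)"]]
print(isTRUE(all.equal(used, x$myweights)))

# The coefficient names show which columns entered as predictors.
print(names(coef(fit)))
```

Comparing the stored weights with the intended vector, and looking at the coefficient names, answers both halves of the question: which weights were used, and whether the weights column leaked into the predictors.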
Re: [R] lm looking for weights outside of the user-defined function
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
> Sent: Friday, October 22, 2010 6:25 AM
> To: Dimitri Liakhovitski
> Cc: r-help
> Subject: Re: [R] lm looking for weights outside of the user-defined function
>
> On Oct 22, 2010, at 9:18 AM, Dimitri Liakhovitski wrote:
>
> > David, I understand - and I am sure what you are suggesting should work. But I just can't understand why it's not grabbing things INSIDE the environment of the formula first.
>
> I am not sure that either one of us understands what is meant by the environment of the formula.

The environment of the formula is the output of environment(formula), which is set to the current environment when the formula is created. The modelling functions look for variables (in the formula, weights, and subset arguments) in the order

   1) the data argument (usually an environment or a list)
   2) the environment of the formula

When an environment is searched for a name, the search continues through all ancestral environments until the name is found or until you run out of ancestors.

You can reassign the environment of a formula. E.g., compare the following two:

    wr0 <- function(formula, MyData, WeightsVector) {
        lm(formula, data=MyData, weights=WeightsVector)
    }
    wr1 <- function(formula, MyData, WeightsVector) {
        environment(formula) <- environment()
        lm(formula, data=MyData, weights=WeightsVector)
    }

    wr0(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
    Error in eval(expr, envir, enclos) : object 'WeightsVector' not found

    wr1(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))

    Call:
    lm(formula = formula, data = MyData, weights = WeightsVector)

    Coefficients:
    (Intercept)          cyl
         38.567       -2.966

Reassigning the environment can lead to the sort of surprises that dynamic scoping gives you.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> > I've already tried to define the weights outside of the function - and it finds them. But shouldn't it go in this order?
> > 1. Look in the data frame
> > 2. Look in the environment of the user-defined function
> > 3. Look outside.
>
> Hey, I only work here, I don't make the rules, I just follow them. I agree that one might guess that to be the search order, but it is not what is documented.
>
> --
> David Winsemius, MD
> West Hartford, CT
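The notion of "the environment of the formula" can be probed directly with environment(). The three helper functions below (passed_in, built_inside, rebound) are hypothetical names written only for this sketch; they compare a formula's environment with the frame of the function that is currently running.

```r
# A formula supplied as an argument was created by the caller,
# so its environment is NOT this function's frame.
passed_in <- function(formula) {
  identical(environment(formula), environment())
}

# A formula written inside the function belongs to the function's frame.
built_inside <- function() {
  f <- y ~ x
  identical(environment(f), environment())
}

# The wr1 approach: re-point the formula at the function's frame, after
# which names such as WeightsVector can be found through it.
rebound <- function(formula, WeightsVector) {
  environment(formula) <- environment()
  exists("WeightsVector", envir = environment(formula), inherits = FALSE)
}

print(passed_in(y ~ x))
print(built_inside())
print(rebound(y ~ x, 1:3))
```

The flip side of the wr1 approach, and presumably part of the "surprises" mentioned above, is that after rebinding every local name in the wrapper (formula, MyData, and so on) shadows a like-named variable that the user intended to come from their own workspace.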
Re: [R] lm looking for weights outside of the user-defined function
On Oct 22, 2010, at 12:17 PM, William Dunlap wrote:

> The environment of the formula is the output of environment(formula), which is set to the current environment when the formula is created. The modelling functions look for variables (in the formula, weights, and subset arguments) in the order
>    1) the data argument (usually an environment or a list)
>    2) the environment of the formula
> When an environment is searched for a name, the search continues through all ancestral environments until the name is found or until you run out of ancestors.
>
> You can reassign the environment of a formula. E.g., compare the following two:
>
>    wr0 <- function(formula, MyData, WeightsVector) {
>        lm(formula, data=MyData, weights=WeightsVector)
>    }
>    wr1 <- function(formula, MyData, WeightsVector) {
>        environment(formula) <- environment()
>        lm(formula, data=MyData, weights=WeightsVector)
>    }
>    wr0(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
>    Error in eval(expr, envir, enclos) : object 'WeightsVector' not found

The wr0 call created a formula, but the weights vector was outside that environment? And that was because the formula was created at the stage of evaluating the function arguments, when they wouldn't see each other? Except this works:

    xtoy <- function(x = 1:2, y=x){y}
    xtoy()
    [1] 1 2

I'm trying to figure out what makes the wr0 version fail and the xtoy() function succeed.

--
David.

>    wr1(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
>
>    Call:
>    lm(formula = formula, data = MyData, weights = WeightsVector)
>
>    Coefficients:
>    (Intercept)          cyl
>         38.567       -2.966
>
> Reassigning the environment can lead to the sort of surprises that dynamic scoping gives you.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
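Why xtoy() succeeds can be seen from R's standard argument rules: default values are evaluated inside the function's own frame, while supplied arguments are evaluated in the caller's frame. A brief sketch; the function caller and its local string are hypothetical additions for illustration.

```r
xtoy <- function(x = 1:2, y = x) { y }

# Default y = x is evaluated inside xtoy's frame, where x is an argument.
print(xtoy())

# A *supplied* y = x is evaluated where the call is written, so here it
# picks up the caller's x, not the x = 10 passed to xtoy.
caller <- function() {
  x <- "caller's x"
  xtoy(x = 10, y = x)
}
print(caller())
```

The same rule explains the formula in the wr0 call: mpg~cyl is a supplied argument, so it is created in the caller's environment and carries that environment with it into lm().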