Re: [R] using lm() with variable formula
I was solving a similar problem some time ago. Here is my script. I had a data frame containing a response and several other variables, which were assumed to be predictors. I was trying to choose the best linear approximation. This approach now seems useless to me; please don't blame me for that. However, the script might be useful to you.

<code>
library(forward)

# dfr is a data.frame that contains everything.
# The response variable is named med5x.
# The following lines construct linear models for all possible formulas
# of the form
#   med5x~T+a+height
#   med5x~a+height+RH
# T, a, RH, etc. are the names of possible predictors.
inputs <- names(dfr)[c(10:30, 1)]
# dfr was a very large data frame containing a lot of variables;
# here we have chosen only a subset of them.

for (nc in 11:length(inputs)) {
  # the linear models were assumed to have at least 11 terms
  # now we generate a character vector containing the formulas
  formulas <- paste("med5x", sep = "~",
                    fwd.combn(inputs, nc,
                              fun = function(x) { paste(x, collapse = "+") }))
  # and then we try to fit every one of them
  for (f in formulas) {
    lms <- lm(eval(parse(text = f)), data = dfr)
    # append the formula and its residual sum of squares to a file
    cat(file = "linear_models.txt", f, sum(residuals(lms)^2), "\n",
        sep = "\t", append = TRUE)
  }
}
</code>

Hmm, looking back, I see that this is a rather inefficient script. For example, the inner loop can easily be replaced with one of the apply functions.

Chris Elsaesser wrote:
> New to R; please excuse me if this is a dumb question. I tried to RTFM; didn't help.
>
> I want to do a series of regressions over the columns in a data.frame, systematically varying the response variable and the terms; and not necessarily including all the non-response columns. In my case, the columns are time series. I don't know if that makes a difference; it does mean I have to call lag() to offset non-response terms. I can not assume a specific number of columns in the data.frame; might be 3, might be 20.
>
> My central problem is that the formula given to lm() is different each time. For example, say a data.frame had columns with the following headings: height, weight, BP (blood pressure), and Cals (calorie intake per time frame). In that case, I'd need something like the following:
>
>   lm(height ~ weight + BP + Cals)
>   lm(height ~ weight + BP)
>   lm(height ~ weight + Cals)
>   lm(height ~ BP + Cals)
>   lm(weight ~ height + BP)
>   lm(weight ~ height + Cals)
>   etc.
>
> In general, I'll have to read the header to get the argument labels.
>
> Do I have to write several functions, each taking a different number of arguments? I'd like to construct a string or list representing the variables in the formula and apply lm(), so to say. [I'm mainly a Lisp programmer where that part would be very simple. Anyone have a Lisp API for R? :-}]

--
View this message in context: http://www.nabble.com/using-lm%28%29-with-variable-formula-tf3772540.html#a10716815
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
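The remark in this message about replacing the inner loop with an apply-style call might look roughly like the sketch below. It is only an illustration under assumptions that are not in the original post: the built-in mtcars data stands in for dfr, "mpg" stands in for med5x, base R's combn() is used in place of fwd.combn(), and rss_for is a hypothetical helper name.

```r
# Sketch: fit every model with a fixed response and all predictor
# subsets of size 2, collecting the residual sum of squares (RSS).
# Assumptions: mtcars stands in for dfr, "mpg" for med5x.
response <- "mpg"
inputs <- c("wt", "hp", "disp", "qsec")

# rss_for() is a hypothetical helper: build the formula, fit, return the RSS
rss_for <- function(vars) {
  f <- as.formula(paste(response, "~", paste(vars, collapse = " + ")))
  sum(residuals(lm(f, data = mtcars))^2)
}

# All subsets of size 2, as a list of character vectors
subsets <- combn(inputs, 2, simplify = FALSE)

# vapply() replaces the explicit inner for loop
rss <- vapply(subsets, rss_for, numeric(1))
names(rss) <- vapply(subsets, paste, character(1), collapse = " + ")

# Models ordered from smallest to largest RSS
sort(rss)
```

Collecting the results in a named vector instead of appending to a file on every iteration keeps the comparison in memory; writing the table out once at the end would be an easy addition.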
Re: [R] using lm() with variable formula [Broadcast]
One way to do it is to give lm() a data frame with the right variables as its first argument each time. If lm() is given a data frame as the first argument, it will treat the first variable as the LHS and the rest as the RHS of the formula. For example, you can do:

    lm(myData[c("height", "weight", "BP", "Cals")])

(The drawback to this is that the formula in the fitted model object looks a bit strange...)

Andy

From: Chris Elsaesser
> [...] My central problem is that the formula given to lm() is different each time. [...]
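A small check of the data-frame-as-first-argument approach described in this reply, assuming the built-in mtcars data in place of myData (the column choice is arbitrary and only for illustration):

```r
# Sketch: passing a data frame as the first argument to lm().
# Assumption: mtcars stands in for myData.
cols <- c("mpg", "wt", "hp")      # the first name becomes the response

fit_df <- lm(mtcars[cols])                        # data frame as first argument
fit_formula <- lm(mpg ~ wt + hp, data = mtcars)   # explicit formula

# Both fits should give the same coefficients
all.equal(coef(fit_df), coef(fit_formula))

# The stored call shows the data frame expression rather than a formula
fit_df$call
```

Reordering or subsetting cols is then enough to change both the response and the terms, with no formula string to build.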
[R] using lm() with variable formula
New to R; please excuse me if this is a dumb question. I tried to RTFM; didn't help.

I want to do a series of regressions over the columns in a data.frame, systematically varying the response variable and the terms; and not necessarily including all the non-response columns. In my case, the columns are time series. I don't know if that makes a difference; it does mean I have to call lag() to offset non-response terms. I can not assume a specific number of columns in the data.frame; might be 3, might be 20.

My central problem is that the formula given to lm() is different each time. For example, say a data.frame had columns with the following headings: height, weight, BP (blood pressure), and Cals (calorie intake per time frame). In that case, I'd need something like the following:

    lm(height ~ weight + BP + Cals)
    lm(height ~ weight + BP)
    lm(height ~ weight + Cals)
    lm(height ~ BP + Cals)
    lm(weight ~ height + BP)
    lm(weight ~ height + Cals)
    etc.

In general, I'll have to read the header to get the argument labels.

Do I have to write several functions, each taking a different number of arguments? I'd like to construct a string or list representing the variables in the formula and apply lm(), so to say. [I'm mainly a Lisp programmer where that part would be very simple. Anyone have a Lisp API for R? :-}]

Thanks,
chris

Chris Elsaesser, PhD
Principal Scientist, Machine Learning
SPADAC Inc.
7921 Jones Branch Dr. Suite 600
McLean, VA 22102
703.371.7301 (m)
703.637.9421 (o)
Re: [R] using lm() with variable formula
Try this:

    lm(Sepal.Length ~ ., iris[1:3])

    # or
    cn <- c("Sepal.Length", "Sepal.Width", "Petal.Length")
    lm(Sepal.Length ~ ., iris[cn])

On 5/17/07, Chris Elsaesser [EMAIL PROTECTED] wrote:
> [...] My central problem is that the formula given to lm() is different each time. [...]
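Building on the `~ .` idea in this reply, one way to also vary the response is to let each selected column take a turn on the left-hand side. This is a sketch only; the loop and the use of as.formula() are additions for illustration, not part of the reply.

```r
# Sketch: each column in cn takes a turn as the response and is
# regressed on the other selected columns via "~ .".
cn <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

fits <- lapply(cn, function(resp) {
  # Build e.g. "Sepal.Width ~ ." and restrict the data to the chosen columns,
  # so that "." expands to the remaining columns of iris[cn]
  lm(as.formula(paste(resp, "~ .")), data = iris[cn])
})
names(fits) <- cn

# Inspect the expanded formula of each fit
lapply(fits, formula)
```

Because `.` is expanded against the data argument, choosing a different cn changes the terms without touching the formula text.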
Re: [R] using lm() with variable formula
    tmp <- data.frame(matrix(rnorm(40), 10, 4,
                             dimnames=list(NULL, c("Y","A","B","C"))))
    tmp

    # build the formula text from the column names
    tmp.form <- paste(names(tmp)[1],
                      paste(names(tmp)[-1], collapse=" + "),
                      sep=" ~ ")
    tmp.form

    lm(tmp.form, tmp)

The R language is powerful enough to do most of the lisp-like things you may want to do.

Rich
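The same string-building idea can be extended to enumerate every response together with every non-empty subset of the remaining columns, which is what the original question asks for. The sketch below uses reformulate() from the stats package in place of paste(); all_formulas is a hypothetical helper name, and set.seed() is added only to make the example data reproducible.

```r
# Sketch: enumerate every response and every non-empty subset of the
# remaining columns, building each formula with reformulate().
set.seed(1)
tmp <- data.frame(matrix(rnorm(40), 10, 4,
                         dimnames = list(NULL, c("Y", "A", "B", "C"))))

# all_formulas() is a hypothetical helper, not an existing R function
all_formulas <- function(vars) {
  out <- list()
  for (resp in vars) {
    others <- setdiff(vars, resp)
    for (k in seq_along(others)) {
      # every subset of size k of the non-response columns
      for (rhs in combn(others, k, simplify = FALSE)) {
        out[[length(out) + 1]] <- reformulate(rhs, response = resp)
      }
    }
  }
  out
}

forms <- all_formulas(names(tmp))
fits <- lapply(forms, lm, data = tmp)

length(fits)   # 4 responses x 7 predictor subsets each
forms[[1]]
```

The number of models grows as n * (2^(n-1) - 1) for n columns, so with 20 columns an exhaustive search like this is not practical and some restriction on subset size would be needed.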
Re: [R] using lm() with variable formula
... and note that if a matrix of responses is on the left of ~, separate regressions will be simultaneously fit to each of the columns of the matrix. Note that this **is** in TFM -- ?lm.

Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck
Sent: Thursday, May 17, 2007 8:22 AM
To: Chris Elsaesser
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] using lm() with variable formula

Try this:

    lm(Sepal.Length ~ ., iris[1:3])

    # or
    cn <- c("Sepal.Length", "Sepal.Width", "Petal.Length")
    lm(Sepal.Length ~ ., iris[cn])
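A short illustration of the matrix-response behaviour documented in ?lm, using the built-in iris data; the choice of columns is arbitrary and only for the example.

```r
# Sketch: a matrix of responses on the left of ~ fits one regression
# per column, all sharing the same right-hand side.
responses <- c("Sepal.Length", "Sepal.Width")
Y <- as.matrix(iris[responses])

fit_multi <- lm(Y ~ Petal.Length + Petal.Width, data = iris)
coef(fit_multi)    # one column of coefficients per response

# Each column agrees with the corresponding single-response fit
fit_single <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
all.equal(unname(coef(fit_multi)[, "Sepal.Length"]),
          unname(coef(fit_single)))
```

This covers the case where several responses share one set of terms; when the terms also differ per response, separate lm() calls are still needed.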