Michael Wither <[email protected]> wrote on 01/28/2012 03:18:29 PM:
> Re: [R] R extracting regression coefficients from multiple
> regressions using lapply command
>
> Michael Wither to: Jean V Adams, 01/28/2012 03:18 PM
>
> Hi, this code is actually great (much better than any other response
> I've gotten or seen). But I have one more question. This puts the
> regression coefficients for state 1 in row 1, which is what I do
> need, but then it puts coef1 from the first regression in column 1,
> then coef1 from the second regression in column 2, then coef1 from
> the third regression in column 3, ... What I need is coef 1 from
> first regression in column 1, coef2 from regression 1 in column 2,
> coef3 from regression 1 in column 3, ... And then after the first 28
> columns are filled in (27 covariates plus the constant term), I'd
> like coef1 from the 2nd regression to go in column 29, coef2 from
> the 2nd regression to go in column 30, ...
> Does this make sense? Do you know how to do this?
> Thank you again so much for your help,
> Michael
Adjust the dimensions of your array, and fill in the data accordingly ...
coefs <- array(NA, dim=c(nstates, nxs+1, nys))
for(i in 1:nstates) {
    for(k in 1:nys) {
        coefs[i, , k] <- lm(formula(paste(names(PUF)[k],
            covariates, sep=" ~ ")),
            data=PUF[PUF$state==i, ])$coef
    }
}
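
The array above holds states in rows, coefficients in the second dimension, and responses in the third. If a single wide matrix is wanted (all coefficients of the first regression, then all of the second, and so on), one possible approach is to collapse the last two dimensions. What follows is a minimal sketch on made-up data shaped like the earlier example; the names coefmat and coefnames and the column labels are illustrative assumptions, not part of the original code.

```r
# Made-up data in the same shape as the earlier example (not the real PUF)
set.seed(1)
nstates <- 5; nobs <- 30; nys <- 3; nxs <- 4
PUF <- data.frame(matrix(rnorm(nstates * nobs * (nys + nxs)),
                         nrow = nstates * nobs,
                         dimnames = list(NULL, c(paste0("Y", 1:nys),
                                                 paste0("X", 1:nxs)))))
PUF$state <- rep(1:nstates, nobs)
covariates <- paste(names(PUF)[nys + (1:nxs)], collapse = " + ")

# states x coefficients x responses, filled as in the loop above
coefs <- array(NA, dim = c(nstates, nxs + 1, nys))
for (i in 1:nstates) {
  for (k in 1:nys) {
    fit <- lm(formula(paste(names(PUF)[k], covariates, sep = " ~ ")),
              data = PUF[PUF$state == i, ])
    coefs[i, , k] <- coef(fit)
  }
}

# R stores arrays in column-major order, so collapsing dimensions 2 and 3
# places every coefficient of response 1 first, then response 2, and so on
coefmat <- matrix(coefs, nrow = nstates)

# Hypothetical labels of the form response.coefficient;
# the coefficient index varies fastest across columns
coefnames <- c("Intercept", names(PUF)[nys + (1:nxs)])
colnames(coefmat) <- paste(rep(names(PUF)[1:nys], each = nxs + 1),
                           rep(coefnames, times = nys), sep = ".")
rownames(coefmat) <- paste0("state", 1:nstates)
dim(coefmat)
```

With 27 covariates and 28 responses, the same reshaping would give 28 columns per regression, so the second regression would start in column 29.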
Jean
> On Fri, Jan 27, 2012 at 1:05 AM, Michael Wither <[email protected]> wrote:
> Thanks, this does help a bit. I'll keep on trying to figure it out.
> Thanks,
> Michael
>
> On Thu, Jan 26, 2012 at 6:42 AM, Jean V Adams <[email protected]> wrote:
>
> Michael Wither wrote on 01/26/2012 12:08:19 AM:
>
>
> > Hi, I have a question about running multiple regressions in R and then
> > storing the coefficients. I have a large dataset with several variables,
> > one of which is a state variable, coded 1-50 for each state. I'd like to
> > run a regression of 28 select variables on the remaining 27 variables of
> > the dataset (there are 55 variables total), separately for each state,
> > i.e., run a regression of variable1 on covariate1, covariate2, ...,
> > covariate27 for observations where state==1. I'd then like to repeat this
> > for variable1 for states 2-50, and then repeat the whole process for
> > variable2, variable3, ..., variable28. I think I've written the correct R
> > code to do this, but the next thing I'd like to do is extract the
> > coefficients, ideally into a coefficient matrix. Could someone please help
> > me with this? Here's the code I've written so far; I'm not sure if this is
> > the best way to do this. Please help me.
> >
> > for (num in 1:50) {
> >
> > #PUF is the data set I'm using
> >
> > #Subset the data by states
> > PUFnum <- subset(PUF, state==num)
> >
> > #Attach data set with state specific data
> > attach(PUFnum)
> >
> > #Run our prediction regression
> > #the variables class1 through e19700 are the 27 covariates I want to use
> > regression <- lapply(PUFnum, function(z) lm(z ~
> > class1+class2+class3+class4+class5+class6+class7+xtot+e00200+e00300
> > +e00600+e00900+e01000+p04470+e04800+e09600+e07180+e07220+e07260
> > +e06500+e10300+e59720+e11900+e18425+e18450+e18500+e19700))
> >
> > Beta <- lapply(regression, function(d) d<- coef(regression$d))
> >
> > detach(PUFnum)
> >
> > }
> > Thanks,
> > Mike
>
>
> This should help you get started.
>
> # You don't provide any sample data, so I made some up myself
> nstates <- 5
> nobs <- 30
> nys <- 3
> nxs <- 4
> PUF <- data.frame(matrix(rnorm(nstates*nobs*(nys+nxs)),
>     nrow=nstates*nobs,
>     dimnames=list(NULL, c(paste("Y", 1:nys, sep=""),
>         paste("X", 1:nxs, sep="")))))
> PUF$state <- rep(1:nstates, nobs)
> head(PUF)
>
> # create a character vector of all your covariate names
> # separated by a plus sign
> # this will serve as the right half of your regression equations
> covariates <- paste(names(PUF)[nys + (1:nxs)], collapse=" + ")
>
> # create an empty array to be filled with coefficients
> coefs <- array(NA, dim=c(nstates, nys, nxs+1))
>
> # fill the array with coefficients
> # this will work for you if the first 28 columns of your PUF
> # data frame are the response variables
> for(i in 1:nstates) {
>     for(j in 1:nys) {
>         coefs[i, j, ] <- lm(formula(paste(names(PUF)[j], covariates,
>             sep=" ~ ")),
>             data=PUF[PUF$state==i, ])$coef
>     }
> }
> coefs
>
> Jean
>
>
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.