Michael Wither wrote on 01/26/2012 12:08:19 AM:
> Hi, I have a question about running multiple regressions in R and then
> storing the coefficients. I have a large dataset with several variables,
> one of which is a state variable, coded 1-50 for each state. I'd like to
> run a regression of 28 select variables on the remaining 27 variables of
> the dataset (there are 55 variables total), and specific for each state,
> i.e., run a regression of variable1 on covariate1, covariate2, ...,
> covariate27 for observations where state==1. I'd then like to repeat this
> for variable1 for states 2-50, and then repeat the whole process for
> variable2, variable3, ..., variable28. I think I've written the correct R
> code to do this, but the next thing I'd like to do is extract the
> coefficients, ideally into a coefficient matrix. Could someone please help
> me with this? Here's the code I've written so far; I'm not sure if this is
> the best way to do this. Please help me.
>
> for (num in 1:50) {
>
> #PUF is the data set I'm using
>
> #Subset the data by states
> PUFnum <- subset(PUF, state==num)
>
> #Attach data set with state specific data
> attach(PUFnum)
>
> #Run our prediction regression
> #the variables class1 through e19700 are the 27 covariates I want to use
> regression <- lapply(PUFnum, function(z) lm(z ~
> class1+class2+class3+class4+class5+class6+class7+xtot+e00200+e00300
> +e00600+e00900+e01000+p04470+e04800+e09600+e07180+e07220+e07260
> +e06500+e10300+e59720+e11900+e18425+e18450+e18500+e19700))
>
> Beta <- lapply(regression, function(d) d<- coef(regression$d))
>
> detach(PUFnum)
>
> }
> Thanks,
> Mike
This should help you get started.
# You don't provide any sample data, so I made some up myself
nstates <- 5
nobs <- 30
nys <- 3
nxs <- 4
PUF <- data.frame(matrix(rnorm(nstates*nobs*(nys+nxs)), nrow=nstates*nobs,
    dimnames=list(NULL, c(paste("Y", 1:nys, sep=""),
                          paste("X", 1:nxs, sep="")))))
PUF$state <- rep(1:nstates, nobs)
head(PUF)
# create a character vector of all your covariate names
# separated by a plus sign
# this will serve as the right half of your regression equations
covariates <- paste(names(PUF)[nys + (1:nxs)], collapse=" + ")
# create an empty array to be filled with coefficients
coefs <- array(NA, dim=c(nstates, nys, nxs+1))
# fill the array with coefficients
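# this will work for you if the first 28 columns of your PUF
# data frame are the response variables and the next 27 columns
# are the covariates (with nys, nxs, and nstates set to match)
for(i in 1:nstates) {
    for(j in 1:nys) {
        # build the formula "Yj ~ X1 + X2 + ..." and fit it to state i only
        coefs[i, j, ] <- coef(lm(formula(paste(names(PUF)[j], covariates, sep=" ~ ")),
            data=PUF[PUF$state==i, ]))
    }
}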
coefs
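
Since you asked for a coefficient matrix, here is a rough sketch of one way to get labelled output. It is a variation on the loop above, not the only way to do it. It relies on lm() accepting a matrix response (built here with cbind), which fits each column separately, so one call per state covers all the responses. The names ynames, xnames, and f are my own, and the data are made up as before.

```r
# Made-up data with the same layout as the example above
set.seed(1)
nstates <- 5
nobs <- 30
nys <- 3
nxs <- 4
ynames <- paste("Y", 1:nys, sep="")   # hypothetical response names
xnames <- paste("X", 1:nxs, sep="")   # hypothetical covariate names
PUF <- data.frame(matrix(rnorm(nstates*nobs*(nys+nxs)), nrow=nstates*nobs,
    dimnames=list(NULL, c(ynames, xnames))))
PUF$state <- rep(1:nstates, nobs)

# one formula with every response on the left-hand side:
# cbind(Y1, Y2, Y3) ~ X1 + X2 + X3 + X4
f <- as.formula(paste("cbind(", paste(ynames, collapse=", "), ") ~ ",
    paste(xnames, collapse=" + "), sep=""))

# labelled array: state x response x coefficient
coefs <- array(NA_real_, dim=c(nstates, nys, nxs+1),
    dimnames=list(state=as.character(1:nstates), response=ynames,
        term=c("(Intercept)", xnames)))

for(i in 1:nstates) {
    fit <- lm(f, data=PUF[PUF$state==i, ])
    # coef(fit) has one row per term and one column per response,
    # so transpose it to response x term before storing
    coefs[i, , ] <- t(coef(fit))
}

# coefficient matrix for a single response: one row per state
coefs[, "Y1", ]
```

With the dimnames in place you can index by name, e.g. coefs["2", "Y1", "X3"] for one coefficient, or coefs[, "Y1", ] for a states-by-terms matrix for the first response.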
Jean