Hi all,
I frequently encounter datasets that require me to repeat the same calculation
across many variables. For example, given a dataset with total employment
variables and manufacturing employment variables for the years 1990-2010, I
might have to calculate manufacturing's share of total employment in each year.
I find it cumbersome to have to manually define a share for each year and would
like to know how others might handle this kind of task.
For example, given the data frame:
df <- data.frame(a1 = 1:10, a2 = 11:20, a3 = 21:30,
                 b1 = 101:110, b2 = 111:120, b3 = 121:130)
I'd like to append new variables--c1, c2, and c3--to the data frame that are
the result of a1/b1, a2/b2, and a3/b3, respectively.
When there are only a few of these variables, I don't really have a problem,
but it becomes a chore when the number of variables increases. Is there a way I
can do this kind of processing using a loop? I tried defining a vector to hold
the names for the "c variables" (e.g. c1, c2, ..., cn) and creating new variables
in a loop using code like:
avars <- c("a1", "a2", "a3")
bvars <- c("b1", "b2", "b3")
cvars <- c("c1", "c2", "c3")
for (i in 1:3) {
  df$cvars[i] <- df$avars[i] / df$bvars[i]
}
But the variable references don't resolve properly with this particular syntax.
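My understanding, which may be incomplete, is that $ takes the name on its
right-hand side literally, so df$avars looks for a column actually called
"avars" instead of using the string stored in avars[i]. Below is a minimal
sketch of one possible workaround, assuming the three name vectors line up
element by element. It indexes with [[ ]], which does evaluate its argument.
The copy df2 is only a hypothetical name used to show a loop-free variant of
the same calculation.

```r
df <- data.frame(a1 = 1:10, a2 = 11:20, a3 = 21:30,
                 b1 = 101:110, b2 = 111:120, b3 = 121:130)

avars <- c("a1", "a2", "a3")
bvars <- c("b1", "b2", "b3")
cvars <- c("c1", "c2", "c3")

# [[ ]] evaluates its argument, so cvars[i] is used as the column name.
for (i in seq_along(cvars)) {
  df[[cvars[i]]] <- df[[avars[i]]] / df[[bvars[i]]]
}

# Loop-free variant on a copy (df2 is an illustrative name):
# dividing two data frames works column by column, and assigning
# to df2[cvars] creates the new columns under the names in cvars.
df2 <- df[c(avars, bvars)]
df2[cvars] <- df2[avars] / df2[bvars]
```

Both versions rely on avars, bvars, and cvars being in matching order, so
for the 1990-2010 case the name vectors would need to be built consistently,
for example with paste0() over the years.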
Any help would be much appreciated. Cheers.