Hi: You could try something like this:
For illustration, I'll use a data frame that was presented in a recent post to the ggplot2 group. The poster wanted regressions by individual, but you can add more than one grouping variable to the code shown below. It uses the plyr package.

library(plyr)

ds_test <- structure(list(
    individual = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L),
        .Label = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30"),
        class = "factor"),
    time = c(0L, 1671L, 1896L, 0L, 105L, 196L, 384L, 582L, 797L, 998L, 1419L, 0L, 290L, 451L, 752L, 0L, 487L, 619L, 820L, 0L, 384L, 463L, 832L, 932L, 1322L, 1688L, 0L, 101L, 390L, 0L, 746L, 761L, 899L, 1118L, 1236L, 1375L, 0L, 544L, 870L, 927L, 1117L, 1870L, 0L, 326L, 383L, 573L, 1326L, 0L, 1572L, 1592L),
    size = c(2, 2.6, 2.6, 1.2, 1.4, 1.5, 1.6, 1.7, 1.8, 2, 2.2, 1.3, 1.6, 1.5, 1.5, 2.8, 2.8, 2.4, 2.9, 2.1, 2.4, 2.4, 2.4, 2.3, 2.5, 2.4, 6, 5.8, 5.4, 1.1, 1.6, 1.5, 1.5, 1.5, 2.3, 2.3, 3.2, 4.1, 4, 3.9, 4.1, 4.3, 1.2, 2.1, 2.2, 2.2, 3, 2.2, 3, 3.9)),
    .Names = c("individual", "time", "size"),
    row.names = c(NA, 50L), class = "data.frame")

# Run models by individual and put the results into a list. The advantage
# is that one can extract multiple pieces from each component of the list,
# if so desired, by writing simple extraction functions using plyr. dlply() is an
# apply-like function: the first letter (d) indicates that the input object (first
# argument) is a data frame, and the second letter (l) that the output object
# is a list (in this case, a list of lm objects, one per individual). The anonymous
# function in the call performs the desired operation on each generic data subset x.
mods <- dlply(ds_test, .(individual), function(x) lm(size ~ time, data = x))

# This function does the actual work within each subgroup; since the number
# of residuals varies from group to group, the output of the calling
# function has to be a list of residual vectors, one component per individual.
# The outer do.call() collapses that list into a single vector, which can
# then be attached to the original data frame with $. Note that this relies
# on the rows of ds_test being ordered by individual (as they are here),
# because the list components come back in group order.
res <- function(x) resid(x)
ds_test$u <- do.call(c, llply(mods, res))

In your case, where you have multiple grouping factors, you may have to be a little more careful (in particular, check that the residuals line up with the rows they came from), but the strategy is the same.

You could possibly reduce it to a one-liner (untested):

ds_test$u <- do.call(c, dlply(ds_test, .(individual), function(x) resid(lm(size ~ time, data = x))))

HTH,
Dennis

On Wed, Nov 24, 2010 at 4:56 PM, Ray Zhang <lz...@sfu.ca> wrote:
>
> Hi there,
>
> I have a huge data set with multiple firms, years and other firm
> characteristics. I want to run a regression of the dependent variable on
> other explanatory variables and calculate the residual terms by grouping the
> firms in the same year and the same industry.
>
> What I want to do is divide my observations into subsamples that contain
> the observations with the same fiscal year (FYEAR=1990) and the same firm
> characteristic (Industry=1), run the regression, and put the residuals
> back into the observations by creating a new column. I want to do that for
> multiple years and multiple firms. I wonder, is there any easy command that
> does this without creating multiple loops?
>
> Ray
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
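
For the two grouping factors in the question (fiscal year and industry), the same split-apply-combine strategy might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses base R's split() and unsplit() rather than plyr, the data frame firms is made-up toy data, and y and x are hypothetical stand-ins for the dependent and explanatory variables (only the column names FYEAR and Industry come from the question).

```r
# Hypothetical toy data: FYEAR and Industry follow the question;
# y and x are made-up stand-ins for the real variables.
set.seed(1)
firms <- data.frame(
  FYEAR    = rep(c(1990, 1991), each = 10),
  Industry = rep(c(1, 2), times = 10),
  x        = rnorm(20)
)
firms$y <- 1 + 2 * firms$x + rnorm(20)

# One grouping factor per year-by-industry cell (unused cells dropped)
groups <- interaction(firms$FYEAR, firms$Industry, drop = TRUE)

# Split the rows by cell, fit one regression per cell, keep the residuals
pieces <- split(firms, groups)
res_by_group <- lapply(pieces, function(d) resid(lm(y ~ x, data = d)))

# unsplit() puts each residual back in the row it came from, so the
# data do not need to be sorted by the grouping variables
firms$u <- unname(unsplit(res_by_group, groups))

head(firms)
```

Because unsplit() reverses the split using the same grouping factor, this variant does not depend on the row order of the data, which is the main thing to watch for when concatenating per-group results with do.call(c, ...).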