Does this do what you want? It creates a new data frame containing only those 'mg' groups that have at least a given number of observations.

> set.seed(2)
> # create some test data
> x <- data.frame(mg=sample(LETTERS[1:4], 20, TRUE), data=1:20)
> # split the data into subsets based on 'mg'
> x.split <- split(x, x$mg)
> str(x.split)
List of 4
 $ A:'data.frame':      7 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1
  ..$ data: int [1:7] 1 4 7 12 14 18 20
 $ B:'data.frame':      3 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 2 2 2
  ..$ data: int [1:3] 9 15 19
 $ C:'data.frame':      4 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 3 3 3 3
  ..$ data: int [1:4] 2 3 10 11
 $ D:'data.frame':      6 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 4 4 4 4 4 4
  ..$ data: int [1:6] 5 6 8 13 16 17
> # only choose subsets with at least 5 observations
> x.5 <- lapply(x.split, function(a) {
+     if (nrow(a) >= 5) return(a)
+     else return(NULL)
+ })
> # create new dataframe with these observations
> x.new <- do.call('rbind', x.5)
> x.new
     mg data
A.1   A    1
A.4   A    4
A.7   A    7
A.12  A   12
A.14  A   14
A.18  A   18
A.20  A   20
D.5   D    5
D.6   D    6
D.8   D    8
D.13  D   13
D.16  D   16
D.17  D   17
>

On 8/9/07, Ron Crump <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I generally do my data preparation externally to R, so
> this is a bit unfamiliar to me, but a colleague has asked
> me how to do certain data manipulations within R.
>
> Anyway, basically I can get his large file into a dataframe.
> One of the columns is a management group code (mg). There may be
> varying numbers of observations per management group, and
> he would like to subset the dataframe such that there are
> always at least n per management group.
>
> I presume I can get to this using table or tapply, then
> (and I'm not sure how on this bit) creating a column nmg
> containing the number of observations that corresponds to
> mg for that row, then simply subsetting.
>
> So, am I on the right track? If so how do I actually do it, and
> is there an easier method than I am considering?
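
On the table route you describe: it should work too. Below is a minimal sketch of it, meant as an illustration and not as something tried on your colleague's real file. The column name 'nmg' is the one you suggested. The threshold variable 'n' and the hand-typed test data (chosen to match the groups in the str() output above) are assumptions of mine.

```r
# Test data typed in by hand to match the groups shown in the str() output
# above, so the result does not depend on the random number generator.
x <- data.frame(
  mg   = c("A", "C", "C", "A", "D", "D", "A", "D", "B", "C",
           "C", "A", "D", "A", "B", "D", "D", "A", "B", "A"),
  data = 1:20
)

n <- 5  # assumed threshold: minimum number of observations per group

# table() counts the rows in each group; indexing that table by the group
# label of every row gives the per-row count, i.e. the proposed 'nmg' column.
counts <- table(x$mg)
x$nmg <- as.vector(counts[as.character(x$mg)])

# keep only the rows whose group has at least n observations
x.new <- x[x$nmg >= n, ]

# Alternative without the helper column: ave() applies length() within each
# group and returns a vector as long as the input.
x.alt <- x[ave(seq_along(x$mg), x$mg, FUN = length) >= n, ]

x.new
```

Unlike the split/rbind version, this keeps the rows in their original order and with their original row names, and it leaves the count in the 'nmg' column in case it is useful later.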
>
> Thanks for your help,
> Ron
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?