Hi, I generally do my data preparation outside R, so this is a bit unfamiliar to me, but a colleague has asked me how to do certain data manipulations within R.
Anyway, I can get his large file into a data frame. One of the columns is a management group code (mg). The number of observations varies between management groups, and he would like to subset the data frame so that every management group remaining has at least n observations. I presume I can get there using table or tapply, then (and I'm not sure how to do this bit) creating a column nmg containing, for each row, the number of observations in that row's mg, then simply subsetting on nmg.

So, am I on the right track? If so, how do I actually do it, and is there an easier method than the one I am considering?

Thanks for your help,
Ron

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
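
For illustration, here is a minimal sketch of the approach described in the question (count per mg, attach the count as nmg, then subset), using base R only. The toy data frame, the extra column y, and the threshold n = 2 are made up for the example; only the names mg and nmg come from the question. It may not be the neatest way, but it shows the idea.

```r
# Toy data standing in for the colleague's file (hypothetical values).
df <- data.frame(
  mg = c("A", "A", "A", "B", "C", "C"),
  y  = c(1.2, 3.4, 2.2, 5.0, 4.1, 0.7)
)
n <- 2  # assumed minimum number of observations per management group

# ave() applies FUN within each mg group and returns one value per row,
# so nmg holds the size of the group that each row belongs to.
df$nmg <- ave(seq_along(df$mg), df$mg, FUN = length)

# Keep only the rows whose management group has at least n observations.
kept <- df[df$nmg >= n, ]

# Alternative without the helper column: count with table(), then match.
counts <- table(df$mg)
kept2 <- df[df$mg %in% names(counts)[counts >= n], ]
```

With this toy data, group B has a single row and is dropped, while all rows of A and C are retained. The table() variant avoids adding a column, at the cost of being slightly harder to read.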