A very similar question was asked a couple of days ago - see the thread titled "Removing rows in dataframe w'o duplicated values" - in particular, the responses by Dimitris Rizopoulos and David Winsemius. The adaptation to this problem is:

df[ave(as.numeric(df$group), as.numeric(df$group), FUN = length) > 4, ]
   group        x
1      A 3.903747
2      A 3.599547
3      A 2.449991
4      A 2.740639
5      A 4.268988
6      B 8.649600
7      B 5.493841
8      B 1.892154
9      B 6.781754
10     B 1.459250
11     B 6.749522

HTH,
Dennis

On Thu, Nov 24, 2011 at 4:02 AM, Johannes Radinger <jradin...@gmx.at> wrote:
> Hello,
>
> assume we have following dataframe:
>
> group <-c(rep("A",5),rep("B",6),rep("C",4))
> x <- c(runif(5,1,5),runif(6,1,10),runif(4,2,15))
> df <- data.frame(group,x)
>
> Now I want to select all cases (rows) for those groups
> which have more or equal 5 cases (so I want to select
> all cases of group A and B).
> How can I use the indexing for such questions?
>
> df[??]... I think it is probably quite easy but I really
> don't know how to do that at the moment.
>
> maybe someone can help me...
>
> /johannes
> --
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
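
A note on the ave() one-liner above: as.numeric(df$group) only works when group is a factor, which was the data.frame() default when this was posted. Since R 4.0.0 the default is stringsAsFactors = FALSE, so group would be a character column and as.numeric() would return NA. The following is a minimal sketch of the same idea that should not depend on that setting. The names n_per_row, big, counts, keep and big2 are made up for illustration and are not from the thread.

```r
# Recreate the example data from the question
# (x is random, so the values differ from run to run)
group <- c(rep("A", 5), rep("B", 6), rep("C", 4))
x <- c(runif(5, 1, 5), runif(6, 1, 10), runif(4, 2, 15))
df <- data.frame(group, x)

# Group size for every row of df. seq_along() is only a numeric
# placeholder here; FUN = length ignores the values and returns
# the number of rows in each group.
n_per_row <- ave(seq_along(df$group), df$group, FUN = length)

# Keep the rows whose group has 5 or more cases
big <- df[n_per_row >= 5, ]

# Alternative via table(): find the qualifying group labels first,
# then match rows against them
counts <- table(df$group)
keep <- names(which(counts >= 5))
big2 <- df[df$group %in% keep, ]
```

Both big and big2 should contain the 11 rows of groups A and B. The table() version may be easier to read when the list of qualifying groups is also of interest on its own.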