Do you by any chance want to sample from each group equally, to get an equal-representation matrix? Here is an example of the input:

 mydf <- data.frame( value=1:100, value2=rnorm(100),
                     grp=rep( LETTERS[1:4], c(35, 15, 30, 20) ) )

which has 35 observations from A, 15 from B, 30 from C and 20 from D.

And here is a function that I wrote:

 sample.by.group <- function(df, grp, k, replace=FALSE){

   # recycle a single k so that every group gets the same sample size;
   # a vector k is matched to the groups in sorted level order
   if(length(k)==1){ k <- rep(k, length(unique(grp))) }

   if(!replace && any(k > table(grp)))
     stop( paste("Cannot take a sample larger than the population when 'replace = FALSE'.\n",
                 "Please specify a value no greater than", min(table(grp)),
                 "or use 'replace = TRUE'.\n") )

   ind   <- model.matrix( ~ -1 + grp )   # one 0/1 indicator column per group
   w.mat <- list(NULL)

   for(i in 1:ncol(ind)){
     rows <- which( ind[,i]==1 )         # row numbers belonging to group i
     # index via sample.int so that a group with a single row is handled
     # correctly (sample(x, ...) treats a single number x as 1:x)
     w.mat[[i]] <- rows[ sample.int( length(rows), k[i], replace=replace ) ]
   }

   out <- df[ unlist(w.mat), ]
   return(out)
 }

And here are some examples of how to use it:

 mydf <- mydf[ sample(1:nrow(mydf)), ]   # scramble it for fun

 out1 <- sample.by.group(mydf, mydf$grp, k=10 )
 table( out1$grp )

 out2 <- sample.by.group(mydf, mydf$grp, k=50, replace=TRUE)   # i.e. bootstrap
 table( out2$grp )

and you can even do bootstrapping or sampling with a different sample size (weight) for each group via:

 out3 <- sample.by.group(mydf, mydf$grp, k=c(20, 20, 30, 30), replace=TRUE)
 table( out3$grp )

Regards, Adai

On Fri, 2006-03-17 at 16:01 +0000, Dan Bolser wrote:
> Hi,
>
> I have tuples of data in rows of a data.frame, each column is a variable
> for the 'items' (one per row).
>
> One of the variables is the 'size' of the item (row).
>
> I would like to cut my data.frame into groups such that each group has
> the same *total size*. So, assuming that we order by size, some groups
> should have several small items while other groups have a few large
> items. All the groups should have approximately the same total size.
>
> I have tried various combinations of cut, quantile, and ecdf, and I just
> can't work out how to do this!
>
> Any help is greatly appreciated!
>
> All the best,
> Dan.
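
If the aim is instead what the quoted message describes (groups with approximately the same total size), a minimal sketch might look like the one below. The function name equal.size.groups and the example data are hypothetical and not from the thread. The sketch assumes non-negative sizes with a positive total. It sorts the items by size and slices the running total into n equal parts, so group totals can differ from the target by up to the size of the largest single item.

```r
# Hypothetical helper (an assumption, not part of the original thread):
# assign each item to one of n groups so that the groups have roughly
# the same total size.
equal.size.groups <- function(size, n) {
  ord   <- order(size)              # process items from smallest to largest
  cs    <- cumsum(size[ord])        # running total in sorted order
  total <- cs[length(cs)]
  # slice the running total into n equal parts; clamp to 1..n so that
  # zero-size items and rounding at the top end stay in range
  g <- pmin(n, pmax(1, ceiling(n * cs / total)))
  out <- integer(length(size))
  out[ord] <- as.integer(g)         # map back to the original row order
  out
}

set.seed(1)
items <- data.frame(id = 1:200, size = rexp(200))   # made-up example data
items$grp <- equal.size.groups(items$size, 4)

tapply(items$size, items$grp, sum)   # total size per group
table(items$grp)                     # number of items per group
```

Because the items are sorted first, the low-numbered groups hold many small items and the high-numbered groups hold a few large ones, which matches the behaviour asked for in the question. A more balanced split would need a proper bin-packing heuristic, which this sketch does not attempt.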

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html