There are functions to do weighted summary statistics in the Hmisc package (wtd.quantile, ...).
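To illustrate what a weighted quantile computes, here is a minimal base-R sketch. The helper name weighted_quantile is made up for this example and is not part of Hmisc. It uses a plain inverse-CDF definition with no interpolation, so its result may differ slightly from what wtd.quantile returns on the same data.

```r
# Hypothetical helper (not from Hmisc): weighted quantile by the
# inverse-CDF rule, i.e. the smallest x whose cumulative share of
# the total weight reaches the requested probability.
weighted_quantile <- function(x, w, probs = 0.5) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  ord <- order(x)                 # sort values, carrying weights along
  x <- x[ord]
  w <- w[ord]
  cum <- cumsum(w) / sum(w)       # cumulative weight share
  sapply(probs, function(p) x[which(cum >= p)[1]])
}

# Made-up summarized data: the value 10 stands for 5 observations.
x <- c(1, 2, 3, 10)
w <- c(1, 1, 1, 5)

median(x)                  # ignores the weights: 2.5
weighted_quantile(x, w)    # respects the weights: 10
median(rep(x, times = w))  # expanding the rows agrees here: 10
```

The expanded-row median averages the two middle values when the total weight is even, so the two approaches can disagree when the cumulative weight share lands exactly on 0.5.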
For more complicated analyses (but not plots yet), the biglm package has a bigglm function that expects the data in chunks, so you could write a function that expands part of the dataset at a time.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Rick Bischoff
Sent: Wednesday, August 30, 2006 8:28 AM
To: r-help@stat.math.ethz.ch
Subject: [R] working with summarized data

The data sets I am working with all have a weight variable--e.g., each row doesn't mean 1 observation. With that in mind, nearly all of the graphs and summary statistics are incorrect for my data, because they don't take the weight into account.

****
For example, "median" is incorrect, as the quantiles aren't calculated with weights:

    sum( weights[X < median(X)] ) / sum(weights)

This should be 0.5... of course it's not.
****

Unfortunately, it seems that most (all?) of R's graphics and summary statistic functions don't take a weight or frequency argument. (Fortunately the models do...) Am I completely missing how to do this?

One way would be to replicate each row in proportion to its weight (e.g., if the weight was 4, we would add 3 additional copies), but this will get prohibitive pretty quickly as the dataset grows.

Thanks in advance!

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.