Olaf Mersmann <[EMAIL PROTECTED]> writes:

> Hello all,
>
> I'm new to R (and the S language in general) so go easy on me if this is
> really simple.
>
> Given a data.frame df which looks like this:
>   f1 f2 f3 f4 c1 c2
> 1  y  y  a  b 10 20
> 2  n  y  b  a 20 20
> 3  n  n  b  b  8 10
> 4  y  n  a  a 30  5
>
> I'd like to aggregate it by the factors f1 and f2 (or f2 and f3, or any
> other combination of the three) and compute the sum of c1 and c2 (as
> separate values). I can do this just fine as long as there is only one
> column with counts using tapply or mApply out of Hmisc, but I've been
> unable to come up with a solution that works with two or more columns.
>
> In SQL a query to achieve this would look something like this:
> SELECT f1, f2, sum(c1), sum(c2) FROM df GROUP BY f1, f2
>
> Any hints on how this is done efficiently in R would be greatly
> appreciated.
I think aggregate() will do what you want. If not, notice that whatever
you can do with a single factor, you can also do with interaction(f1,f2)
or maybe interaction(f1,f2, drop=TRUE).

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
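For illustration, here is a minimal sketch of both suggestions, using the data frame from the question. The variable names sums_f1f2, sums_f1f3, g and by_group are made up for this example, and the column selection df[c("c1", "c2")] is just one way to pass several count columns at once; treat it as a starting point rather than the only way to do it.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  f1 = factor(c("y", "n", "n", "y")),
  f2 = factor(c("y", "y", "n", "n")),
  f3 = factor(c("a", "b", "b", "a")),
  f4 = factor(c("b", "a", "b", "a")),
  c1 = c(10, 20, 8, 30),
  c2 = c(20, 20, 10, 5)
)

# Suggestion 1: aggregate() sums every column passed in, per group.
# Roughly: SELECT f1, f2, sum(c1), sum(c2) FROM df GROUP BY f1, f2
sums_f1f2 <- aggregate(df[c("c1", "c2")],
                       by = list(f1 = df$f1, f2 = df$f2),
                       FUN = sum)

# Any other combination of factors works the same way, e.g. f1 and f3.
sums_f1f3 <- aggregate(df[c("c1", "c2")],
                       by = list(f1 = df$f1, f3 = df$f3),
                       FUN = sum)

# Suggestion 2: collapse f1 and f2 into a single factor with interaction(),
# then apply the familiar single-factor tapply() to each count column.
g <- interaction(df$f1, df$f2, drop = TRUE)
by_group <- sapply(df[c("c1", "c2")], function(col) tapply(col, g, sum))

print(sums_f1f2)
print(sums_f1f3)
print(by_group)
```

aggregate() returns a data frame with one row per factor combination that occurs in the data, while the interaction() route returns a matrix whose row names are the combined levels (such as "y.n"). With drop=TRUE, combinations that never occur are left out.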
