Group the data frame by its first two columns, apply colMeans to each group, and then rbind the pieces of the resulting by object together:
do.call(rbind, by(DF, DF[2:1], colMeans, na.rm = TRUE))

On 10/5/06, Greg Tarpinian <[EMAIL PROTECTED]> wrote:
> R 2.3.1, WinXP:
>
> I have a puzzling problem that I suspect may be solved using
> grep or a regular expression but am at a loss how to actually do it...
> My data.frame looks like
>
>    Location   Time   X     Y
>    --------   ----   ---   ---
>    1          0      1.6   9.3
>    1          3      4.2   10.4
>    1          6      2.7   16.3
>    2          0      0.5   2.1
>    2          3      NA    3.6
>    2          3      5.0   0.06
>    2          6      3.4   14.0
>
> and so forth. I would like to search for duplicate Time values
> within a Location and take the numerical average (where possible)
> of the elements in X and Y. These numerical averages should
> then be used to create a single row where multiple rows once
> existed. So, I would like to obtain
>
>    2          3      5.0   1.83
>
> for the two Time = 3 rows for Location = 2 and use it to replace
> these two rows. Ideally, avoiding for(i in 1:blah) loops would be
> nice because the data.frame has about 10,000 rows that need to
> be searched and processed. My intent is to do some comparing of
> SAS to R -- the DATA step processing in SAS is quite fast and
> using the RETAIN statement along with the LAG( ) function allows
> this sort of thing to be done rapidly.
>
>
> Thanks in advance,
>
>     Greg
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
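
For anyone who wants to try this, here is a small sketch that runs the one-liner on the seven sample rows from the question. The data frame is retyped by hand from the post, and the names `res` and `res_df` are mine rather than anything from the original message, so treat it as an illustration only.

```r
# Sample rows retyped from the question (assumed to be representative).
DF <- data.frame(
  Location = c(1, 1, 1, 2, 2, 2, 2),
  Time     = c(0, 3, 6, 0, 3, 3, 6),
  X        = c(1.6, 4.2, 2.7, 0.5, NA, 5.0, 3.4),
  Y        = c(9.3, 10.4, 16.3, 2.1, 3.6, 0.06, 14.0)
)

# DF[2:1] is (Time, Location): by() splits the rows on every combination
# of the two columns, and colMeans(..., na.rm = TRUE) averages each piece
# while skipping NAs. Putting Time first makes Location the slowest-varying
# index, so the stacked rows come out ordered by Location, then Time.
res <- do.call(rbind, by(DF, DF[2:1], colMeans, na.rm = TRUE))

# rbind on numeric vectors gives a matrix; convert it if a data frame is needed.
res_df <- as.data.frame(res)
print(res_df)
```

With this input the two Location = 2, Time = 3 rows collapse into a single row with X = 5.0 and Y = 1.83, which is the row asked for in the question. A group whose X (or Y) values are all NA would come back as NaN in that column, since nothing is left to average once na.rm drops the NAs.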
