On Fri, 2006-02-24 at 08:18 -0800, Matt Crawford wrote:
> I am having trouble doing the following. I have a data.frame like
> this, where x and y are a variable that I want to do calculations on:
>
> Name Year  x  y
> ab   2001 15  3
> ab   2001 10  2
> ab   2002 12  8
> ab   2003  7 10
> dv   2002 10 15
> dv   2002  3  2
> dv   2003  1 15
>
> Before I do all the other things I need to do with this data, I need
> to summarize or collapse the data by name and year. I've found that I
> can do things like
> nameyear<-interaction(name,year)
> dataframe$nameyear<-nameyear
> tapply(dataframe$x,dataframe$nameyear,sum)
> tapply(dataframe$y,dataframe$nameyear,sum)
> and then bind those together.
>
> But my problem is that I need to somehow retain the original Names in
> my collapsed dataset, so that later I can do analyses with the Name
> factors. All I can think of is something like
> tapply(dataframe$Name,dataframe$nameyear, somefunction?)
> but nothing seems to work.
>
> I'm actually trying to convert a SAS program, and I can't get out of
> that mindset. There, it's a simple Proc Means, By Name Year.
>
> Thanks for any help or suggestions on the right way to go about this.
>
> Matt Crawford

Matt,

Just use aggregate():

> aggregate(MyDF[, 3:4], list(Name = MyDF$Name, Year = MyDF$Year), sum)
  Name Year  x  y
1   ab 2001 25  5
2   ab 2002 12  8
3   dv 2002 13 17
4   ab 2003  7 10
5   dv 2003  1 15

See ?aggregate for more information.

HTH,

Marc Schwartz
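
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. It assumes the data frame is called MyDF (the name used in the reply) and rebuilds it from the seven rows quoted in the question; the result name "collapsed" is only illustrative. Columns are selected by name rather than by position, which should behave the same as MyDF[, 3:4] for this layout.

```r
# Rebuild the example data from the question.
# MyDF is the name used in the reply; the values are the rows quoted above.
MyDF <- data.frame(
  Name = c("ab", "ab", "ab", "ab", "dv", "dv", "dv"),
  Year = c(2001, 2001, 2002, 2003, 2002, 2002, 2003),
  x    = c(15, 10, 12, 7, 10, 3, 1),
  y    = c(3, 2, 8, 10, 15, 2, 15)
)

# Sum x and y within each Name/Year combination.
# The grouping variables come back as ordinary columns named Name and Year,
# so Name stays available for later analyses.
collapsed <- aggregate(MyDF[, c("x", "y")],
                       by = list(Name = MyDF$Name, Year = MyDF$Year),
                       FUN = sum)

print(collapsed)
```

Because Name is returned as a regular column of the collapsed data frame, there is no need for the interaction() and tapply() workaround. If a later analysis needs it as a factor, collapsed$Name <- factor(collapsed$Name) should be enough.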
