Hi Cameron,

You need to be more specific when you ask a question so that you can get a better answer. Anyhow, when you say that you want to retain all the other variables, do you mean that you want to add a new column to the dataset that contains the calculated sum? If that is the case, you can use a construction like:

# Reproducible example data: three trips, three records each
set.seed(1)
step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
# Sum CONVUNIT within each TRIPID
result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
# Look up each row's trip total by TRIPID and store it in a new column
step4[,"SUM"]=result[match(step4[,"TRIPID"],names(result))]
step4
  TRIPID CONVUNIT SUM
1    111       36 122
2    222       48 121
3    333       48 129
4    111       42 122
5    222       30 121
6    333       43 129
7    111       44 122
8    222       43 121
9    333       38 129

Cheers

Francisco

>From: "Guenther, Cameron" <[EMAIL PROTECTED]>
>To: "Francisco J. Zagmutt" <[EMAIL PROTECTED]>
>Subject: RE: [R] Unique?
>Date: Thu, 11 May 2006 12:08:31 -0400
>
>It is close but not quite what I want. I need to retain all of the
>other variables as well.
>
>
>Cameron Guenther, Ph.D.
>Associate Research Scientist
>FWC/FWRI, Marine Fisheries Research
>100 8th Avenue S.E.
>St. Petersburg, FL 33701
>(727)896-8626 Ext. 4305
>[EMAIL PROTECTED]
>-----Original Message-----
>From: Francisco J. Zagmutt [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, May 10, 2006 6:06 PM
>To: Guenther, Cameron; [email protected]
>Subject: RE: [R] Unique?
>
>If you only care about the sum of CONVUNIT by each TRIPID then you can
>use tapply, i.e.:
>
>step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
>result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
>result
>111 222 333
>115 107 123
>
>Is this what you wanted to do? I can't think of anything faster than
>tapply for your problem.
>
>I hope this helps
>
>Francisco
>
>
> >From: "Guenther, Cameron" <[EMAIL PROTECTED]>
> >To: <[email protected]>
> >Subject: [R] Unique?
> >Date: Wed, 10 May 2006 17:02:33 -0400
> >
> >Hello,
> >I have a sample data set that looks like:
> >
> >YEAR MONTH DAY CONTINUE SPL       TIMEFISH TIMEUNIT AREA COUNTY DEPTH DEPUNIT GEAR    TRIPID      CONVUNIT
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     25  NA       SP0052652 8        H        7    25     4     NA      1000000 02163399057 85
> >1992 1     26  NA       SP0037940 8        H        7    25     4     NA      1000000 02163399058 70
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 15
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 20
> >1992 1     27  NA       SP0026324 8        H        7    25     4     NA      1000000 02163399060 8
> >1992 1     28  1        SP0072357 8        H        7    25     4     NA      1000000 02163399062 200
> >
> >How can I use unique to extract the rows that have repeated TRIPIDs
> >only, not a unique value for each variable but only for TRIPID? I then
> >want to condense the unique values by summing the CONVUNIT for each
> >unique value of TRIPID. I posted a similar question last week and
> >received a sufficient answer on how to do this without using unique.
> >The solution below worked just fine on this sample data set, but the
> >full data set has 446,000 rows of data, and my computer and R simply
> >cannot handle the following code on data this large.
> >
> >conds<-by(Step4,Step4$TRIPID,function(x)
> >replace(x[1,],"CONVUNIT",sum(x$CONVUNIT)))
> >Step5<-do.call(rbind,conds)
> >
> >Thank you,
> >
> >Cameron Guenther, Ph.D.
> >Associate Research Scientist
> >FWC/FWRI, Marine Fisheries Research
> >100 8th Avenue S.E.
> >St. Petersburg, FL 33701
> >(727)896-8626 Ext. 4305
> >[EMAIL PROTECTED]
> >
> >______________________________________________
> >[email protected] mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide!
> >http://www.R-project.org/posting-guide.html
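
For the original question (one row per TRIPID, CONVUNIT summed, every other column kept), one possible approach is to compute the per-trip totals with ave() and then keep the first row of each trip with duplicated(). This is only a sketch: the toy data frame below reuses a few TRIPID, SPL and CONVUNIT values from the sample above, the names trip_sum, first_row and step5 are made up for illustration, and it has not been timed on the full 446,000-row data set.

```r
# Toy data: a few rows from the sample in the question, with SPL standing
# in for "all the other variables". TRIPID is kept as character so the
# leading zero is preserved.
step4 <- data.frame(
  TRIPID   = c("02163399054", "02163399054", "02163399054", "02163399054",
               "02163399057", "02163399059", "02163399059"),
  SPL      = c("SP0073928", "SP0073928", "SP0004228", "SP0004228",
               "SP0052652", "SP0072357", "SP0072357"),
  CONVUNIT = c(161, 8, 161, 8, 85, 15, 20),
  stringsAsFactors = FALSE
)

# ave() returns a vector as long as its input, holding for each row the
# sum of CONVUNIT over that row's TRIPID.
trip_sum <- ave(step4$CONVUNIT, step4$TRIPID, FUN = sum)

# TRUE for the first row of each TRIPID, FALSE for later repeats.
first_row <- !duplicated(step4$TRIPID)

# Keep the first row per trip (all columns retained) and overwrite
# CONVUNIT with the per-trip total, as the by()/replace() code did.
step5 <- step4[first_row, ]
step5$CONVUNIT <- trip_sum[first_row]
step5
```

Because ave() and duplicated() work on whole vectors, this avoids building one small data frame per trip and rbind-ing them together, which is the expensive part of the by()/do.call(rbind, ...) construction.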
