Re: [R] Unique?

Francisco J. Zagmutt Thu, 11 May 2006 10:14:15 -0700

Hi Cameron

You need to be more specific when you ask a question so you can get a better 
answer.  Anyhow, when you say that you want to retain all the other 
variables, do you mean that you want to add a new column to the dataset 
that contains the sum of CONVUNIT for each TRIPID?  If that is the case, you 
can use a construction like:


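set.seed(1)  # make the simulated counts reproducible
# Toy data: three trips, three records each
step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
# Total CONVUNIT per TRIPID (a vector named by TRIPID)
result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
# Look up each row's trip total and store it in a new column
step4[,"SUM"]<-result[match(step4[,"TRIPID"],names(result))]
step4
  TRIPID CONVUNIT SUM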
1    111       36 122
2    222       48 121
3    333       48 129
4    111       42 122
5    222       30 121
6    333       43 129
7    111       44 122
8    222       43 121
9    333       38 129
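
If you instead want one row per TRIPID with the other variables kept,
here is a minimal sketch covering both readings.  It assumes that taking
the other variables from the first row of each trip is acceptable, and I
have not tried it on 446,000 rows.  The OTHER column and the names
with_sum, totals and step5 are made up for illustration.

```r
# Toy data shaped like step4 above; OTHER is a hypothetical extra variable.
step4 <- data.frame(TRIPID   = rep(c(111, 222, 333), 3),
                    CONVUNIT = c(36, 48, 48, 42, 30, 43, 44, 43, 38),
                    OTHER    = letters[1:9])

# Reading 1: keep every row and add the per-trip total as a new column.
# ave() returns the group sum already aligned with the original rows.
with_sum <- step4
with_sum$SUM <- ave(step4$CONVUNIT, step4$TRIPID, FUN = sum)

# Reading 2: keep only the first row seen for each TRIPID and replace
# its CONVUNIT with the per-trip total.
totals <- tapply(step4$CONVUNIT, step4$TRIPID, sum)
step5  <- step4[!duplicated(step4$TRIPID), ]
step5$CONVUNIT <- as.vector(totals[as.character(step5$TRIPID)])
```

Both versions avoid building one data frame per trip and calling rbind
on all of them, which is what the by()/do.call() approach does.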


Cheers

Francisco

>From: "Guenther, Cameron" <[EMAIL PROTECTED]>
>To: "Francisco J. Zagmutt" <[EMAIL PROTECTED]>
>Subject: RE: [R] Unique?
>Date: Thu, 11 May 2006 12:08:31 -0400
>
>It is close but not quite what I want.  I need to retain all of the
>other variables as well.
>
>
>Cameron Guenther, Ph.D.
>Associate Research Scientist
>FWC/FWRI, Marine Fisheries Research
>100 8th Avenue S.E.
>St. Petersburg, FL 33701
>(727)896-8626 Ext. 4305
>[EMAIL PROTECTED]
>-----Original Message-----
>From: Francisco J. Zagmutt [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, May 10, 2006 6:06 PM
>To: Guenther, Cameron; [email protected]
>Subject: RE: [R] Unique?
>
>If you only care about the sum of CONVUNIT by each TRIPID then you can
>use tapply i.e.:
>
>step4<-data.frame(TRIPID=rep(c(111,222,333),3),CONVUNIT=rpois(9,40))
>result<-tapply(step4$CONVUNIT,INDEX=step4$TRIPID,FUN=sum)
>result
>111 222 333
>115 107 123
>
>Is this what you wanted to do?  I can't think of anything faster than
>tapply for your problem.
>
>I hope this helps
>
>Francisco
>
>
>
>
> >From: "Guenther, Cameron" <[EMAIL PROTECTED]>
> >To: <[email protected]>
> >Subject: [R] Unique?
> >Date: Wed, 10 May 2006 17:02:33 -0400
> >
> >
> >Hello,
> >I have a sample data set that looks like:
> >
> >YEAR MONTH DAY CONTINUE SPL       TIMEFISH TIMEUNIT AREA COUNTY DEPTH DEPUNIT GEAR    TRIPID      CONVUNIT
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  1        SP0073928 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 161
> >1992 1     26  2        SP0004228 8        H        7    25     4     NA      1000000 02163399054 8
> >1992 1     25  NA       SP0052652 8        H        7    25     4     NA      1000000 02163399057 85
> >1992 1     26  NA       SP0037940 8        H        7    25     4     NA      1000000 02163399058 70
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 15
> >1992 1     27  NA       SP0072357 8        H        7    25     4     NA      1000000 02163399059 20
> >1992 1     27  NA       SP0026324 8        H        7    25     4     NA      1000000 02163399060 8
> >1992 1     28  1        SP0072357 8        H        7    25     4     NA      1000000 02163399062 200
> >
> >How can I use unique to extract only the rows that have repeated TRIPIDs,
> >not a unique value for each variable but only for TRIPID?  I then
> >want to condense the unique values by summing the CONVUNIT for each
> >unique value of TRIPID.  I posted a similar question last week and
> >received a sufficient answer of how to do this without using unique.
> >The solution below worked just fine on this sample data set, but the
> >full data set has 446,000 rows of data and my computer and R simply
> >cannot handle the following code on data this large.
> >
> >conds<-by(Step4,Step4$TRIPID,function(x)
> >replace(x[1,],"CONVUNIT",sum(x$CONVUNIT)))
> >Step5<-do.call(rbind,conds)
> >
> >Thank you,
> >
> >Cameron Guenther, Ph.D.
> >Associate Research Scientist
> >FWC/FWRI, Marine Fisheries Research
> >100 8th Avenue S.E.
> >St. Petersburg, FL 33701
> >(727)896-8626 Ext. 4305
> >[EMAIL PROTECTED]
> >
> >______________________________________________
> >[email protected] mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide!
> >http://www.R-project.org/posting-guide.html
>
>


Re: [R] Unique?

Reply via email to