Re: [R] subsetting by groups, with conditions

2009-12-29 Thread baptiste auguie
Hi,

I think you can also use plyr for this,

dft <- read.table(textConnection("P1id Veg1 Veg2 AreaPoly2 P2ID
 1   p   p   1   1
 1   p   p   1.5 2
 2   p   p   2   3
 2   p   h   3.5 4"), header=TRUE)

library(plyr)

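ddply(dft, .(P1id), function(.df) {
  # keep only the rows where the two vegetation codes agree
  .ddf <- subset(.df, as.character(Veg1)==as.character(Veg2))
  # of those, return the row with the largest AreaPoly2
  .ddf[which.max(.ddf$AreaPoly2), ]
})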

HTH,

baptiste


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] subsetting by groups, with conditions

2009-12-28 Thread Seth W Bigelow
I have a data set similar to this:

P1id    Veg1    Veg2    AreaPoly2       P2ID
1       p       p       1               1
1       p       p       1.5             2
2       p       p       2               3
2       p       h       3.5             4

For each group of Poly1id records, I wish to output (subset) the record 
which has largest AreaPoly2 value, but only if Veg1=Veg2. For this 
example, the desired dataset would be

P1id    Veg1    Veg2    AreaPoly2       P2ID
1       p       p       1.5             2
2       p       p       2               3
 
Can anyone point me in the right direction on this?

Dr. Seth  W. Bigelow
Biologist, USDA-FS Pacific Southwest Research Station
1731 Research Park Drive, Davis California


Re: [R] subsetting by groups, with conditions

2009-12-28 Thread jim holtman
try this:

> x <- read.table(textConnection("P1id Veg1 Veg2 AreaPoly2 P2ID
+ 1   p   p   1   1
+ 1   p   p   1.5 2
+ 2   p   p   2   3
+ 2   p   h   3.5 4"), header=TRUE, as.is=TRUE)
> # split the dataframe by P1id
> x.s <- split(x, x$P1id)
> # now go through the sets to see which is the largest
> result <- lapply(x.s, function(.sub){
+     .match <- subset(.sub, Veg1 == Veg2)
+     # nrow(), not length(): length() of a data frame counts columns
+     if (nrow(.match) > 0){
+         return(.match[which.max(.match$AreaPoly2),])
+     }
+     else {
+         return(NULL)
+     }
+ })
> do.call(rbind, result)
  P1id Veg1 Veg2 AreaPoly2 P2ID
1    1    p    p       1.5    2
2    2    p    p       2.0    3







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] subsetting by groups, with conditions

2009-12-28 Thread Gabor Grothendieck
Assuming your data frame is called DF we can use sqldf like this.  The
inner select calculates the maximum AreaPoly2 for each group such that
Veg1 = Veg2 and the outer select returns the corresponding row.


library(sqldf)
sqldf("select * from DF a where AreaPoly2 =
  (select max(AreaPoly2) from DF where Veg1 = Veg2 and P1id = a.P1id)")

Running it looks like this:

> library(sqldf)
> sqldf("select * from DF a where AreaPoly2 =
+   (select max(AreaPoly2) from DF where Veg1 = Veg2 and P1id = a.P1id)")
  P1id Veg1 Veg2 AreaPoly2 P2ID
1    1    p    p       1.5    2
2    2    p    p       2.0    3
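
For readers who want to follow the two-step logic of the query without sqldf, it can be sketched in base R roughly as below. This is only an illustration under assumptions: the data frame is rebuilt from the example in the original post, and the names matched, grp_max, keep and res are made up for this sketch. As in the query, the outer step compares every row of DF against its group's maximum.

```r
# Example data from the original post; the name DF follows the reply above.
DF <- read.table(text = "P1id Veg1 Veg2 AreaPoly2 P2ID
1 p p 1   1
1 p p 1.5 2
2 p p 2   3
2 p h 3.5 4", header = TRUE, stringsAsFactors = FALSE)

# Inner select: maximum AreaPoly2 per P1id among rows where Veg1 == Veg2.
matched <- DF[DF$Veg1 == DF$Veg2, ]
grp_max <- tapply(matched$AreaPoly2, matched$P1id, max)

# Outer select: rows whose AreaPoly2 equals their own group's maximum.
# Groups with no matching rows give NA here and are dropped.
keep <- DF$AreaPoly2 == grp_max[as.character(DF$P1id)]
res <- DF[!is.na(keep) & keep, ]
res
```

Like the SQL version, this returns every row that ties with the group maximum, so a group can contribute more than one row.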




Re: [R] subsetting by groups, with conditions

2009-12-28 Thread David Winsemius


Can you be more expansive (or perhaps more accurate?) about the
conditions you want satisfied? Looking at that dataset, I only see
one row that has the largest value for AreaPoly2 within the three
records where Veg1==Veg2.


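Otherwise I would think the answer might be along these lines:

> # stringsAsFactors=TRUE (the default before R 4.0.0) so that levels() works below
> dft <- read.table(textConnection("P1id Veg1 Veg2 AreaPoly2 P2ID
+ 1   p   p   1   1
+ 1   p   p   1.5 2
+ 2   p   p   2   3
+ 2   p   h   3.5 4"), header=TRUE, stringsAsFactors=TRUE)
> # Veg1 and Veg2 must share the same factor levels to be compared with ==
> dft$Veg1 <- factor(dft$Veg1, levels=levels(dft$Veg2))
> s.dft <- subset(dft, Veg1==Veg2)
> s.dft[which.max(s.dft$AreaPoly2),]
  P1id Veg1 Veg2 AreaPoly2 P2ID
3    2    p    p         2    3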
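
If the maximum is instead wanted within each P1id group rather than over the whole data frame, the same subset-then-compare idea can be applied per group. A minimal base R sketch, assuming the example data from the original post; the names is.max and per.group are made up for illustration, and ave() is swapped in here for which.max() so that the comparison is done group by group:

```r
# Example data from the original post, read as plain character columns.
dft <- read.table(text = "P1id Veg1 Veg2 AreaPoly2 P2ID
1 p p 1   1
1 p p 1.5 2
2 p p 2   3
2 p h 3.5 4", header = TRUE, stringsAsFactors = FALSE)

# Keep only rows where the two vegetation codes agree.
s.dft <- subset(dft, Veg1 == Veg2)

# ave() returns, for every row, the maximum AreaPoly2 of that row's P1id group.
is.max <- s.dft$AreaPoly2 == ave(s.dft$AreaPoly2, s.dft$P1id, FUN = max)
per.group <- s.dft[is.max, ]
per.group
```

Unlike which.max(), which returns only the first maximum, this keeps all tied rows within a group.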

--
David


