[R] Removal/selecting specific rows in a dataframe conditional on 2 columns

2011-11-01 Thread Aurelie Cosandey Godin
Dear list, 

After reading various emails and blogs and trying several pieces of code without 
any success, I am asking for your help!
I have the following data frame, where each row represents a survey unit with the 
following variables:

 names(RV09)
 [1] record.t  trip      set       month     stratum   NAFO
 [7] unit.area time      dur.set   distance  operation mean.d
[13] min.d     max.d     temp.d    slat      slong     spp
[19] number    weight    elat      elong

Each survey unit generates one set record, denoted by a 5 in column record.t. 
Each species identified in this particular survey unit generates an additional 
set record, denoted by a 6. 

 unique(RV09$record.t)
[1] 5 6

Each survey unit is identified by a specific trip and set number, so if 
there is a 5 record type with no associated 6 records, it means that no species 
were observed in that survey unit. I would like to be able to select all and 
only these survey units, which represent my zeros.

So as an example, in trip number 913, sets 1, 3, and 4 would be part of my 
zeros data.frame, as they appear with no record.t 6, meaning that no species 
were observed in these survey units.

 head(RV09)
    record.t trip set month stratum NAFO unit.area time dur.set distance
585        5  913   1    10     351   3O       R31 1044      17        9
586        5  913   2    10     351   3O       R31 1440      17        9
587        6  913   2    10     351   3O       R31 1440      17        9
588        5  913   3    10     340   3O       Q31 1800      18        9
589        5  913   4    10     340   3O       Q32 2142      17        9

Any tips on how to extract this zero data.frame in R? 
Thank you very much in advance!

Best,
~Aurelie


Aurelie Cosandey-Godin
Ph.D. student, Department of Biology
Industrial Graduate Fellow, WWF-Canada
Dalhousie University | Email: god...@dal.ca



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns

2011-11-01 Thread R. Michael Weylandt
Perhaps use tapply() to split by the survey unit and write a little
identity function that returns only those rows you want, then patch
them all back together with something like simplify2array().
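
One way the split-apply-combine idea above might look in base R is sketched below. It is only an illustration under stated assumptions: split() is swapped in for tapply(), since tapply() applies a function to a vector rather than to the rows of a data frame, and do.call(rbind, ...) is used instead of simplify2array() to stack the pieces back into a data frame. The toy data frame `rv` and the helper `keep_if_empty` are made-up names; `rv` mirrors three columns of the five rows shown in the question.

```r
# Toy data mirroring the five rows in the question (hypothetical object name).
rv <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4)
)

# Hypothetical helper: keep a survey unit only if it has no type-6 records.
keep_if_empty <- function(d) if (any(d$record.t == 6)) NULL else d

# Split by survey unit (trip + set), dropping combinations that never occur.
pieces <- split(rv, list(rv$trip, rv$set), drop = TRUE)

# Filter each piece, discard the NULLs, and stack the survivors.
kept  <- Filter(Negate(is.null), lapply(pieces, keep_if_empty))
zeros <- do.call(rbind, kept)
rownames(zeros) <- NULL
zeros
```

On the toy data this should leave only the type-5 records for sets 1, 3, and 4.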

Michael



Re: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns

2011-11-01 Thread Dennis Murphy
Does this work?

library('plyr')
# Function to return a data frame if it has one row, else return NULL:
f <- function(d) if(nrow(d) == 1L) d else NULL
 ddply(RV09, .(set, month), f)
  record.t trip set month stratum NAFO unit.area time dur.set distance
1        5  913   1    10     351   3O       R31 1044      17        9
2        5  913   3    10     340   3O       Q31 1800      18        9
3        5  913   4    10     340   3O       Q32 2142      17        9

ddply() is an apply-like function that takes a data frame as input and
returns a data frame as output (hence the dd). The first argument is the data
frame name, the second argument the set of grouping variables and the
third is the function to be called (in this application).
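
For readers without plyr, roughly the same one-row-per-group filter can be sketched in base R with ave(). This is an illustration rather than a drop-in replacement: it groups on trip and set, the two columns the question says identify a survey unit, and the toy data frame `rv` is an assumption standing in for RV09.

```r
# Toy data mirroring the five rows in the question (hypothetical object name).
rv <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4)
)

# Count rows per survey unit; ave() returns the count aligned to each row.
n_rows <- ave(seq_len(nrow(rv)), rv$trip, rv$set, FUN = length)

# A unit with exactly one row has only its type-5 record, i.e. no species.
zeros2 <- rv[n_rows == 1, ]
zeros2
```

Like the one-row test in f, this relies on every survey unit having exactly one type-5 record.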

HTH,
Dennis
