[R] Removal/selecting specific rows in a dataframe conditional on 2 columns
Dear list,

After reading various mails and blogs, and trying a few different pieces of code without any success, I am asking for your help!

I have the following data frame, where each row represents a survey unit with the following variables:

    names(RV09)
     [1] record.t  trip      set       month     stratum   NAFO
     [7] unit.area time      dur.set   distance  operation mean.d
    [13] min.d     max.d     temp.d    slat      slong     spp
    [19] number    weight    elat      elong

Each survey unit generates one set record, denoted by a 5 in column record.t. Each species identified in that survey unit generates an additional set record, denoted by a 6.

    unique(RV09$record.t)
    [1] 5 6

Each survey unit is identified by a specific trip and set number, so if there is a type-5 record with no associated type-6 records, it means that no species were observed in that survey unit. I would like to select all of these survey units, and only these, since they represent my zeros.

As an example, in trip number 913, sets 1, 3, and 4 would be part of my zeros data.frame, because they appear with no record.t of 6, meaning that no species were observed in those survey units.

    head(RV09)
        record.t trip set month stratum NAFO unit.area time dur.set distance
    585        5  913   1    10     351   3O       R31 1044      17        9
    586        5  913   2    10     351   3O       R31 1440      17        9
    587        6  913   2    10     351   3O       R31 1440      17        9
    588        5  913   3    10     340   3O       Q31 1800      18        9
    589        5  913   4    10     340   3O       Q32 2142      17        9

Any tips on how to extract this zeros data.frame in R?

Thank you very much in advance!

Best,
~Aurelie

Aurelie Cosandey-Godin
Ph.D. student, Department of Biology
Industrial Graduate Fellow, WWF-Canada
Dalhousie University | Email: god...@dal.ca

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns
Perhaps use tapply() to split by the survey unit and write a little identity function that returns only those rows you want, then patch them all back together with something like simplify2array().

Michael

On Tue, Nov 1, 2011 at 1:16 PM, Aurelie Cosandey Godin god...@dal.ca wrote:
> Any tips on how to extract this zeros data.frame in R?
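The split-filter-recombine idea suggested above can be sketched in base R as follows. This is only an illustration under stated assumptions: it swaps in split(), Filter() and do.call(rbind, ...) for the tapply() and simplify2array() calls named in the reply, because those base functions operate directly on data frames, and it uses a small toy data frame that copies only the first four columns of the head(RV09) output from the original post.

```r
# Toy data (assumption): first four columns of head(RV09) from the post.
RV09 <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4),
  month    = c(10, 10, 10, 10, 10)
)

# One data frame per survey unit (trip x set). drop = TRUE skips
# trip/set combinations that never occur in the data.
units <- split(RV09, list(RV09$trip, RV09$set), drop = TRUE)

# Keep only the units that contain no type-6 (species) record.
zero_units <- Filter(function(d) !any(d$record.t == 6), units)

# Stack the surviving units back into a single data frame.
zeros <- do.call(rbind, zero_units)
rownames(zeros) <- NULL

print(zeros)
```

On the toy data this keeps sets 1, 3 and 4 of trip 913. If every unit had at least one species record, zero_units would be an empty list and do.call(rbind, ...) would return NULL, so real code may want to check for that case.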
Re: [R] Removal/selecting specific rows in a dataframe conditional on 2 columns
Does this work?

    library('plyr')
    # Function to return a data frame if it has one row, else return NULL:
    f <- function(d) if(nrow(d) == 1L) d else NULL
    ddply(RV09, .(set, month), f)
      record.t trip set month stratum NAFO unit.area time dur.set distance
    1        5  913   1    10     351   3O       R31 1044      17        9
    2        5  913   3    10     340   3O       Q31 1800      18        9
    3        5  913   4    10     340   3O       Q32 2142      17        9

ddply() is an apply-like function that takes a data frame as input and returns a data frame as output (hence the dd). The first argument is the data frame, the second is the set of grouping variables, and the third is the function to be called on each group (f in this application).

HTH,
Dennis

On Tue, Nov 1, 2011 at 10:16 AM, Aurelie Cosandey Godin god...@dal.ca wrote:
> Any tips on how to extract this zeros data.frame in R?
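For comparison, here is a base R sketch of the same group-wise filter that needs no extra package. It is not the ddply() call from the reply: it groups on trip and set, which the original question names as the identifiers of a survey unit, and it tests directly for the absence of type-6 records rather than for a group of one row. The toy data frame is an assumption that copies only the first four columns of head(RV09).

```r
# Toy data (assumption): first four columns of head(RV09) from the post.
RV09 <- data.frame(
  record.t = c(5, 5, 6, 5, 5),
  trip     = c(913, 913, 913, 913, 913),
  set      = c(1, 2, 2, 3, 4),
  month    = c(10, 10, 10, 10, 10)
)

# For every row, count the type-6 records within its own trip/set unit.
# ave() returns a vector as long as its input, aligned with the rows.
is_species <- as.integer(RV09$record.t == 6)
n_species  <- ave(is_species, RV09$trip, RV09$set, FUN = sum)

# Rows belonging to units with no species records are the zeros.
zeros <- RV09[n_species == 0, ]

print(zeros)
```

Because ave() keeps the original row order and row names, the result can be matched back to RV09 without any re-sorting.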