[R] How to select specific rows from a data frame based on values
Dear Group:

I am working with a data frame containing 316 rows of individuals with 79 variables. Each of these 79 variables has values that range between -4 and +4, and I want to subset this data frame so that in the resulting new data frame, the values of _all_ of these variables range between -3 and +3.

Let's say I have the following data frame (it's a toy example with 4 individuals and 5 variables):

    subj1 <- cbind(-4, -3, -1, -5, -7)
    subj2 <- cbind(-2, -1, -1, -2, +2)
    subj3 <- cbind(+2, +1, +2, +1, +2)
    subj4 <- cbind(-4, -1, -2, +2, +1)
    mydf <- as.data.frame(rbind(subj1, subj2, subj3, subj4))

From mydf, I want to generate a new data frame (let's call it mydf1) that contains only the records of subj2 and subj3, since these are the only two individuals whose values for variables V1 through V5 all fall between -3 and +3.

The documentation on subsetting and indexing data frames did not help me solve this specific problem. There may be an obvious solution, but I just cannot seem to find it. I would greatly appreciate your input.

[relevant information: R version 2.4.1, running on Windows XP]

/Arin Basu

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to select specific rows from a data frame based on values
Try this:

    > subj1 <- cbind(-4, -3, -1, -5, -7)
    > subj2 <- cbind(-2, -1, -1, -2, +2)
    > subj3 <- cbind(+2, +1, +2, +1, +2)
    > subj4 <- cbind(-4, -1, -2, +2, +1)
    > mydf <- as.data.frame(rbind(subj1, subj2, subj3, subj4))
    > mydf
      V1 V2 V3 V4 V5
    1 -4 -3 -1 -5 -7
    2 -2 -1 -1 -2  2
    3  2  1  2  1  2
    4 -4 -1 -2  2  1
    > apply(mydf, 1, function(x) all(x > -3) && all(x < 3))
    [1] FALSE  TRUE  TRUE FALSE
    > mydf[apply(mydf, 1, function(x) all(x > -3) && all(x < 3)), ]
      V1 V2 V3 V4 V5
    2 -2 -1 -1 -2  2
    3  2  1  2  1  2

On 5/17/07, Arin Basu wrote:
> I want to subset this data frame so that in the resulting new dataframe, values of _all_ of these variables should range between -3 and +3.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
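One detail the thread leaves open is whether the bounds -3 and +3 themselves count as "in range". The sketch below wraps the row-wise `apply()` idea in a small helper with inclusive bounds; `rows_within` is a hypothetical name introduced here for illustration, not something from the thread, and the inclusive choice is an assumption that can be changed by swapping `>=`/`<=` for `>`/`<`.

```r
# Toy data from the thread, with subj4 given five values.
mydf <- as.data.frame(rbind(
  c(-4, -3, -1, -5, -7),
  c(-2, -1, -1, -2,  2),
  c( 2,  1,  2,  1,  2),
  c(-4, -1, -2,  2,  1)
))

# Hypothetical helper (not from the thread): returns one logical per row,
# TRUE when every value in that row lies in the closed interval [lo, hi].
rows_within <- function(df, lo, hi) {
  apply(df, 1, function(x) all(x >= lo & x <= hi))
}

keep  <- rows_within(mydf, -3, 3)
# drop = FALSE keeps the result a data frame even if only one column exists.
mydf1 <- mydf[keep, , drop = FALSE]
print(mydf1)
```

On the toy data both the inclusive and the strict version keep rows 2 and 3, because rows 1 and 4 each contain a value below -3. The two versions would differ only for a row that touches -3 or +3 without going beyond it.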
Re: [R] How to select specific rows from a data frame based on values
Arin Basu wrote:
> I want to subset this data frame so that in the resulting new dataframe, values of _all_ of these variables should range between -3 and +3.

    mydf1 <- mydf[apply(mydf >= -3 & mydf <= 3, MARGIN=1, FUN=all),]

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
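For what it's worth, the same logical-matrix idea can probably be written without `apply()` at all, by counting out-of-range values per row with `rowSums()`. This is only a sketch of an alternative, not something proposed in the thread; the names `out_of_range`, `keep_sums` and `keep_apply` are made up for the example.

```r
# Toy data from the thread, with subj4 given five values.
mydf <- as.data.frame(rbind(
  c(-4, -3, -1, -5, -7),
  c(-2, -1, -1, -2,  2),
  c( 2,  1,  2,  1,  2),
  c(-4, -1, -2,  2,  1)
))

# Comparing a data frame with a scalar gives a logical matrix of the same shape.
out_of_range <- mydf < -3 | mydf > 3

# A row is kept when it contains no out-of-range value.
keep_sums <- rowSums(out_of_range) == 0
mydf1 <- mydf[keep_sums, ]

# Cross-check against the apply() form with inclusive bounds.
keep_apply <- apply(mydf >= -3 & mydf <= 3, MARGIN = 1, FUN = all)
print(all(keep_sums == keep_apply))
print(mydf1)
```

If the real 316 x 79 data contain missing values, both forms yield `NA` for the affected rows, so how to treat `NA` would need to be decided before subsetting.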