Re: [R] Subsetting a data frame
does this do what you want:

> db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
+ 2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28,
+ 9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
+ "data.frame", row.names = c(NA,
+ -4L))
>
> terms_include <- c("1","2","3")
> terms_exclude <- c("1.1","1.2","1.3")
>
> f.match <- function(obj, inc, exc){
+     pat <- paste("^(", paste(inc, collapse = "|"), ")", sep = '')
+     patex <- paste(exc, collapse = "|")
+     isMatch <- apply(obj, 1, function(x) any(grepl(pat, x)))
+     notMatch <- !apply(obj, 1, function(x) any(grepl(patex, x)))
+     obj[isMatch & notMatch,]
+ }
>
> db
   ind test1 test2 test3
1 ind1   1.0    56   1.1
2 ind2   2.0    27  28.0
3 ind3   1.3    58   9.0
4 ind4   3.0     2   1.2
> f.match(db, terms_include, terms_exclude)
   ind test1 test2 test3
2 ind2     2    27    28
>

On Mon, Dec 5, 2011 at 6:32 AM, natalie.vanzuydam wrote:
> I would like to know if there is a way to write a similar function that
> looks for matches that start with the query string: as in
> grepl("^pattern",x)
>
> How would I incorporate look for terms to include but don't return the row
> of the data frame if it also includes one of the terms to exclude while
> using partial matching?

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
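Two details of the f.match() approach may be worth keeping in mind. As far as I know, apply() on a mixed data frame goes through as.matrix(), which format()s numeric columns to a common width (so the 2 in test2 is compared as " 2"), and the "." in the exclude terms acts as a regex wildcard. Below is a minimal sketch of a literal prefix match that avoids both, assuming that comparing against as.character() of each cell is what is wanted. The names starts_with_any and f.prefix are made up for illustration, and startsWith() needs R 3.3.0 or later.

```r
db <- data.frame(
  ind   = c("ind1", "ind2", "ind3", "ind4"),
  test1 = c(1, 2, 1.3, 3),
  test2 = c(56L, 27L, 58L, 2L),
  test3 = c(1.1, 28, 9, 1.2),
  stringsAsFactors = FALSE
)
terms_include <- c("1", "2", "3")
terms_exclude <- c("1.1", "1.2", "1.3")

# TRUE for each element of x that starts with at least one of the terms
# (literal prefix comparison, no regular expressions involved).
starts_with_any <- function(x, terms) {
  Reduce(`|`, lapply(terms, function(tm) startsWith(x, tm)),
         rep(FALSE, length(x)))
}

# Hypothetical helper: keep rows where some cell starts with an include
# term and no cell starts with an exclude term.
f.prefix <- function(obj, inc, exc) {
  # as.character() per column avoids the padding that as.matrix() adds;
  # the result is one long vector in column-major order.
  cells <- unlist(lapply(obj, as.character), use.names = FALSE)
  hit  <- matrix(starts_with_any(cells, inc), nrow = nrow(obj))
  miss <- matrix(starts_with_any(cells, exc), nrow = nrow(obj))
  obj[rowSums(hit) > 0 & rowSums(miss) == 0, ]
}

res <- f.prefix(db, terms_include, terms_exclude)
```

On the example data this keeps the same single row (ind2) as the regex version; on real records the two can differ wherever padding or the "." wildcard comes into play.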
[R] Subsetting a data frame
Hi R users,

I really need help with subsetting data frames:

I have a large database of medical records and I want to be able to match
patterns from a list of search terms.

I've used this simplified data frame in a previous example:

db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28,
9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
"data.frame", row.names = c(NA,
-4L))

terms_include <- c("1","2","3")
terms_exclude <- c("1.1","1.2","1.3")

So in this example I want to include all the terms from terms include as
long as they don't occur with terms exclude in the same row of the data
frame.

Previously I was given this function which works very well if you want to
match exactly:

f <- function(x) !any(x %in% terms_exclude) && any(x %in% terms_include)
db[apply(db[, -1], 1, f), ]

   ind test1 test2 test3
2 ind2     2    27  28.0
4 ind4     3     2   1.2

I would like to know if there is a way to write a similar function that
looks for matches that start with the query string: as in
grepl("^pattern",x)

I started writing a function but am not sure how to get it to return the
dataframe or matrix:

for (i in 1:length(terms_include)){
db_new <- apply(db,2, grepl,pattern=i)
}

Applying this function gives me:

db_new <- structure(c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), .Dim = c(4L,
4L), .Dimnames = list(NULL, c("ind", "test1", "test2", "test3"
)))

So the above is searching the pattern anywhere in the dataframe instead of
just at the beginning of the string.

How would I incorporate look for terms to include but don't return the row
of the data frame if it also includes one of the terms to exclude while
using partial matching?

I hope that this makes sense.

Many thanks,
Natalie

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-tp4160127p4160127.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Subsetting a data frame with multiple values and exclusions.
Thanks. Such a short and sweet answer that does what it should.

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3877472.html
Re: [R] Subsetting a data frame with multiple values and exclusions.
Hi:

Is this what you're after?

f <- function(x) !any(x %in% terms_exclude) && any(x %in% terms_include)
db[apply(db[, -1], 1, f), ]

   ind test1 test2 test3
2 ind2     2    27  28.0
4 ind4     3     2   1.2

HTH,
Dennis

On Wed, Oct 5, 2011 at 8:53 AM, natalie.vanzuydam wrote:
> So I need to write a loop where the search of each value in the list of
> terms_include is searched over the entire data frame. I thought of using
> apply with grepl and subset? At the same time if the value of terms_include
> occurs in the same row as values from terms_exclude then that row must be
> excluded from the output dataframe.
[R] Subsetting a data frame with multiple values and exclusions.
Hi all,

I realise that the convention is to provide a working example of my problem
but the data are of a sensitive nature so I'm not able to do that in this
case.

I need to query a database for multiple search terms:

db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28,
9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
"data.frame", row.names = c(NA,
-4L))

terms_include <- c("1","2","3")
terms_exclude <- c("1.1","1.2","1.3")

So I need to write a loop where the search of each value in the list of
terms_include is searched over the entire data frame. I thought of using
apply with grepl and subset? At the same time if the value of terms_include
occurs in the same row as values from terms_exclude then that row must be
excluded from the output dataframe.

I'm not sure where to even begin. I've only worked very basically with
subset. The final database is much larger and the number of search terms is
many more than are presented here so I would really need to be able to loop
over the data frame successively to return a final df with my searched
values in at least one of the columns.

Your help and assistance is much appreciated,
Natalie

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3874967.html
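The requirement described here (keep a row if any cell equals an include term, unless some cell in the same row equals an exclude term) can probably be handled without an explicit loop. The following is only a sketch on made-up records with character codes, not the poster's data; recs, include and exclude are hypothetical names. It assumes exact, whole-cell matches are wanted and that all searched columns are character.

```r
# Made-up records: one id column plus character code columns.
recs <- data.frame(
  id    = c("p1", "p2", "p3", "p4"),
  code1 = c("A1", "B2", "A1", "C3"),
  code2 = c("X9", "C3", "B7", "D4"),
  stringsAsFactors = FALSE
)
include <- c("A1", "C3")
exclude <- c("X9", "D4")

# Work on the code columns as a character matrix so every comparison
# is string against string.
m <- as.matrix(recs[, -1])

# %in% drops dimensions, so restore them before taking row sums.
has_inc <- rowSums(matrix(m %in% include, nrow = nrow(m))) > 0
has_exc <- rowSums(matrix(m %in% exclude, nrow = nrow(m))) > 0

kept <- recs[has_inc & !has_exc, ]
```

Because the whole matrix is tested in one call, the cost grows with the number of cells rather than with the number of search terms, which may matter for a large table and a long term list.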
Re: [R] Subsetting a data frame by dropping correlated variables
The 'findCorrelation' function in the caret package may be helpful.

On Tue, Apr 19, 2011 at 3:10 PM, Rita Carreira wrote:
> I would like to subset the data by dropping one of the variables that is
> correlated with another variable that I will keep in the data frame.
>
> Since I have several dozen data frames, it is impractical for me to manually
> inspect the correlation matrices and select which variables to drop, so I am
> trying to have R make the selection for me.
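For readers without caret installed, here is a rough base-R sketch of the same idea. It is not caret's algorithm, just a simple rule that drops any column whose absolute correlation with an earlier column reaches the cutoff. The data are simulated, and too_corr is a name invented for the example. The attempt in the question fails, as far as I can tell, because each sapply() call returns a whole logical matrix rather than one TRUE/FALSE per column.

```r
set.seed(1)
# Made-up data: b is nearly a copy of a, c is independent noise.
a <- rnorm(50)
dfQ <- data.frame(a = a, b = a + rnorm(50, sd = 0.01), c = rnorm(50))
dfQ$c[3] <- NA  # a missing value, as in the question

# Absolute pairwise correlations; only the upper triangle is inspected so
# each pair is considered once and the diagonal is ignored.
cm <- abs(cor(dfQ, use = "pairwise.complete.obs", method = "pearson"))
cm[lower.tri(cm, diag = TRUE)] <- 0

# Flag every column that is highly correlated with an earlier column.
too_corr <- apply(cm, 2, function(col) any(col >= 0.8, na.rm = TRUE))
dfQuc <- dfQ[, !too_corr, drop = FALSE]
```

Which member of a correlated pair gets dropped depends on column order here. As I understand it, findCorrelation() takes a correlation matrix and a cutoff and makes that choice using mean absolute correlations, so its result can differ.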
[R] Subsetting a data frame by dropping correlated variables
Hello R Users!

I have a data frame that has many variables, some with missing observations,
and some that are correlated with each other. I would like to subset the data
by dropping one of the variables that is correlated with another variable
that I will keep in the data frame. Alternatively, I could also drop both the
variables that are correlated with each other. Worry not! I am not deleting
data, I am just finding a subset of the data that I can use to impute some
missing observations.

I have tried the following statement

dfQuc <- dfQ[ , sapply(dfQ, function(x) cor(dfQ, use =
"pairwise.complete.obs", method ="pearson")<0.8)]

but it gives me the following error:

Error in `[.data.frame`(dfQ, , sapply(dfQ, function(x) cor(dfQ, use =
"pairwise.complete.obs", :
  undefined columns selected

Since I have several dozen data frames, it is impractical for me to manually
inspect the correlation matrices and select which variables to drop, so I am
trying to have R make the selection for me. Does anyone have any idea on how
to accomplish this?

Thank you very much!

Rita
=
"If you think education is expensive, try ignorance."--Derek Bok
Re: [R] subsetting a data frame
Hi Joseph,
Try this:

# Data set
DF=read.table(textConnection("V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
b e 2:2:6
f c 5:5:0"),header=TRUE)
closeAllConnections()

target=10
DF[sapply(strsplit(as.character(DF$V3), ":"),
          function(x) sum(as.numeric(x))== target), ]

HTH,
Jorge

On Sat, Sep 6, 2008 at 2:25 PM, joseph <[EMAIL PROTECTED]> wrote:
> Hi Jorge
> I got the rows where V3 looks like this 10:10:10; the sum here is 30 and
> not 10.
> I want the rows where the sum is 10, for example 5:5:0 and 2:2:6
> thanks
> Joseph
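If several different conditions on x, y and z are needed, it may be convenient to parse V3 once into a numeric matrix and filter on that. A minimal sketch, assuming every V3 value has exactly three colon-separated numbers and that the first two columns hold single letters as in the printed output elsewhere in the thread; parts, sum_is_10 and all_le_10 are names chosen for the example:

```r
DF <- data.frame(
  V1 = c("a", "d", "c", "b", "f"),
  V2 = c("b", "f", "d", "e", "c"),
  V3 = c("0:1:12", "1:2:1", "1:0:9", "2:2:6", "5:5:0"),
  stringsAsFactors = FALSE
)

# Split each "x:y:z" string and stack the pieces into a numeric matrix,
# one row per record and one column per component.
parts <- do.call(rbind, lapply(strsplit(DF$V3, ":", fixed = TRUE), as.numeric))
colnames(parts) <- c("x", "y", "z")

# Rows whose components add up to 10.
sum_is_10 <- DF[rowSums(parts) == 10, ]

# Rows where every component is at most 10.
all_le_10 <- DF[rowSums(parts <= 10) == ncol(parts), ]
```

Both filters then reduce to arithmetic on the same matrix, so the string splitting is not repeated for each new condition.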
Re: [R] subsetting a data frame
Sorry I didn't read the problem carefully.

--
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis
Re: [R] subsetting a data frame
#something like this?

V1 <- c(1:10)
V2 <- c(0,5,7,8,1,6,5,13,7,0)
V3 <- c(9,5,6,8,1,7,5,33,88,0)
z <- cbind(V1,V2,V3)
row.sums <- rowSums(z)
d <- cbind(z, row.sums)
subset(d, row.sums==10)

On Sat, Sep 6, 2008 at 2:25 PM, joseph <[EMAIL PROTECTED]> wrote:
> Hi Jorge
> I got the rows where V3 looks like this 10:10:10; the sum here is 30 and
> not 10.
> I want the rows where the sum is 10, for example 5:5:0 and 2:2:6
>
> thanks
> Joseph

--
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis
Re: [R] subsetting a data frame
Hi Jorge
I got the rows where V3 looks like this 10:10:10; the sum here is 30 and not 10.
I want the rows where the sum is 10, for example 5:5:0 and 2:2:6
thanks
Joseph

- Original Message
From: Jorge Ivan Velez <[EMAIL PROTECTED]>
To: joseph <[EMAIL PROTECTED]>
Sent: Saturday, September 6, 2008 10:43:09 AM
Subject: Re: [R] subsetting a data frame

Dear Joseph,
Try

DF[sapply(strsplit(as.character(DF$V3), ":"),
          function(i) all(as.numeric(i) == 10)), ]

HTH,
Jorge
Re: [R] subsetting a data frame
Hello
How can I change the function to get the rows with the sum (x+y+z) = 10?
Thank you very much
Joseph

- Original Message
From: Marc Schwartz <[EMAIL PROTECTED]>
To: joseph <[EMAIL PROTECTED]>
Cc: r-help@r-project.org
Sent: Wednesday, September 3, 2008 3:24:58 PM
Subject: Re: [R] subsetting a data frame

How about this:

> DF[sapply(strsplit(as.character(DF$V3), ":"),
            function(i) all(as.numeric(i) <= 10)), ]
  V1 V2    V3
2  d  f 1:2:1
3  c  d 1:0:9
Re: [R] subsetting a data frame
on 09/03/2008 05:06 PM joseph wrote:
> I have a data frame that looks like this:
> V1 V2 V3
> a  b  0:1:12
> d  f  1:2:1
> c  d  1:0:9
> where V3 is in the form x:y:z
> Can someone show me how to subset the rows where the values of x, y and z <=
> 10:
> V1 V2 V3
> d  f  1:2:1
> c  d  1:0:9
> Thanks
> Joseph

How about this:

> DF[sapply(strsplit(as.character(DF$V3), ":"),
            function(i) all(as.numeric(i) <= 10)), ]
  V1 V2    V3
2  d  f 1:2:1
3  c  d 1:0:9

Basically, use strsplit() to break apart 'V3':

> strsplit(as.character(DF$V3), ":")
[[1]]
[1] "0"  "1"  "12"

[[2]]
[1] "1" "2" "1"

[[3]]
[1] "1" "0" "9"

Then use sapply() to crawl the list, converting the elements to numerics
and do the value comparison:

> sapply(strsplit(as.character(DF$V3), ":"),
         function(i) all(as.numeric(i) <= 10))
[1] FALSE  TRUE  TRUE

The above then returns the logical vector to subset the rows of 'DF'.

HTH,

Marc Schwartz
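A small variation on the strsplit() plus sapply() steps above, offered as a sketch rather than a correction: vapply() states up front that each element must yield exactly one logical, so the result is a logical vector even when the data frame has zero rows (sapply() returns an empty list in that case). The data frame is rebuilt here under the assumption that V1 and V2 hold the single letters shown in the printed output; keep and keep_empty are illustrative names.

```r
DF <- data.frame(
  V1 = c("a", "d", "c"),
  V2 = c("b", "f", "d"),
  V3 = c("0:1:12", "1:2:1", "1:0:9"),
  stringsAsFactors = FALSE
)

# TRUE when every component of an "x:y:z" string is at most 10.
all_small <- function(i) all(as.numeric(i) <= 10)

# One logical per row, with the return type checked by vapply().
keep <- vapply(strsplit(as.character(DF$V3), ":", fixed = TRUE),
               all_small, logical(1))
res <- DF[keep, ]

# Same call on an empty data frame still gives a logical vector.
empty <- DF[0, ]
keep_empty <- vapply(strsplit(as.character(empty$V3), ":", fixed = TRUE),
                     all_small, logical(1))
```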
[R] subsetting a data frame
I have a data frame that looks like this:

V1 V2 V3
a  b  0:1:12
d  f  1:2:1
c  d  1:0:9

where V3 is in the form x:y:z
Can someone show me how to subset the rows where the values of x, y and z <= 10:

V1 V2 V3
d  f  1:2:1
c  d  1:0:9

Thanks
Joseph
Re: [R] subsetting a data frame using string matching
> a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
> b = c(1:6)
> example = data.frame("Title" = a, "Vals" = b)
>
> > example
>   Title Vals
> 1 Alpha    1
> 2  Beta    2
> 3 Gamma    3
> 4 Beeta    4
> 5 Alpha    5
> 6  beta    6
>
> I would like to be able to get a new data frame from this data frame
> containing only rows that match a certain string. In this case it
> could for instance be the string "eta". I have tried various ways of
> using agrep and grep, but so far I have not found anything that
> worked.

Sounds like you were nearly there.

rows.to.keep <- grep("eta", example$Title)
subdata <- example[rows.to.keep,]

Regards,
Richie.

Mathematical Sciences Unit
HSL

"Statistics are like a lamp-post to a drunken man - more for leaning on
than illumination."
David Brent, The Office.
Re: [R] subsetting a data frame using string matching
On 1/21/2008 5:18 AM, Karin Lagesen wrote:
> Example data frame:
>
> a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
> b = c(1:6)
> example = data.frame("Title" = a, "Vals" = b)
>
>> example
>   Title Vals
> 1 Alpha    1
> 2  Beta    2
> 3 Gamma    3
> 4 Beeta    4
> 5 Alpha    5
> 6  beta    6
>
> I would like to be able to get a new data frame from this data frame
> containing only rows that match a certain string. In this case it
> could for instance be the string "eta". I have tried various ways of
> using agrep and grep, but so far I have not found anything that
> worked.
>
> Thankyou in advance for your help!

a <- c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
b <- c(1:6)
df <- data.frame(Title = a, Vals = b)

df[grep("eta", df$Title),]
  Title Vals
2  Beta    2
4 Beeta    4
6  beta    6

> Karin

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
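A few variations on the grep() answer, as a sketch only. grepl() returns a logical vector instead of positions, which combines easily with other conditions and behaves predictably when nothing matches (negative indexing with an empty grep() result, by contrast, drops every row). The fixed and ignore.case arguments are standard grepl() options; the object names below are made up for the example.

```r
example <- data.frame(
  Title = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta"),
  Vals  = 1:6,
  stringsAsFactors = FALSE
)

# Literal substring match; no regular-expression interpretation.
has_eta <- example[grepl("eta", example$Title, fixed = TRUE), ]

# Case-insensitive match of the whole string.
is_beta <- example[grepl("^beta$", example$Title, ignore.case = TRUE), ]

# No match: the logical index simply selects zero rows.
none <- example[grepl("zeta", example$Title, fixed = TRUE), ]

# Everything that does NOT contain "eta".
no_eta <- example[!grepl("eta", example$Title, fixed = TRUE), ]
```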
[R] subsetting a data frame using string matching
Example data frame:

a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
b = c(1:6)
example = data.frame("Title" = a, "Vals" = b)

> example
  Title Vals
1 Alpha    1
2  Beta    2
3 Gamma    3
4 Beeta    4
5 Alpha    5
6  beta    6
>

I would like to be able to get a new data frame from this data frame
containing only rows that match a certain string. In this case it
could for instance be the string "eta". I have tried various ways of
using agrep and grep, but so far I have not found anything that
worked.

Thankyou in advance for your help!

Karin
--
Karin Lagesen, PhD student
[EMAIL PROTECTED]
http://folk.uio.no/karinlag