[R] search for string insider a string
Hi, sorry if this is too stupid a question, but how do I do a string search in R? I have a data frame A with A$test like:

test1
bcdtestblabla2.1bla
cdtestblablabla3.88blabla

and I want to search for a substring that starts with 'dtest' and ends with a number, and return the location of that substring and the number, so the end result would be:

NA NA
3  2.1
2  3.88

I find grep can probably do this, but I am new to the function, so I would like a good example.

Thanks,
Richard

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] search for string insider a string
Try this. We use regexpr to get the positions, and strapply puts the values in a list, s. The unlist statement converts NULL to NA and simplifies the list, s, to a numeric vector. For more info on strapply see http://gsubfn.googlecode.com

library(gsubfn) # strapply

x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

dtest.info <- cbind(posn = regexpr("dtest", x),
  value = {
    s <- strapply(x, "dtest[^0-9]*([0-9][0-9.]*)", as.numeric)
    unlist(ifelse(sapply(s, length), s, NA))
  })

# the above may be sufficient but
# if it's important to NA out rows with no match add
dtest.info[dtest.info[,1] < 0,] <- NA

dtest.info
     posn value
[1,]   NA    NA
[2,]    3  2.10
[3,]    2  3.88

Why do you want the position? Is there a further transformation needed? What is it? There may be even easier approaches to the entire problem.

On Fri, Mar 13, 2009 at 12:25 PM, Tan, Richard r...@panagora.com wrote:
> Hi, sorry if it is a too stupid question, but how do I a string search in R: [...]
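For readers who prefer not to load gsubfn, the same position-and-value lookup can be sketched with base R only. This is a minimal sketch under assumptions, not the method given in the thread: it swaps strapply for regexec and regmatches, and the names posn, value and m are chosen here for illustration.

```r
# Base-R sketch: position of "dtest" and the first number that follows it.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

# regexpr() gives the 1-based start of the first match, or -1 for no match.
posn <- as.integer(regexpr("dtest", x, fixed = TRUE))
posn[posn < 0] <- NA

# regexec() records the full match plus each parenthesised group;
# regmatches() extracts them as one character vector per input string
# (an empty vector when the pattern did not match).
m <- regmatches(x, regexec("dtest[^0-9]*([0-9][0-9.]*)", x))

# Keep the captured number (second entry), or NA when there was no match.
value <- vapply(
  m,
  function(g) if (length(g) >= 2) as.numeric(g[2]) else NA_real_,
  numeric(1)
)

dtest.info <- cbind(posn = posn, value = value)
dtest.info
```

Rows without a match come out as NA in both columns, so no separate step is needed to blank them.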
Re: [R] search for string insider a string
That works. I want the position just for the purpose of my later manual check. Thanks a lot, Gabor.

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Friday, March 13, 2009 2:18 PM
To: Tan, Richard
Cc: r-help@r-project.org
Subject: Re: [R] search for string insider a string

> Try this. We use regexpr to get the positions and strapply puts the values in list s. [...]
Re: [R] search for string insider a string
That might be done by splitting the string into four pieces: the portion before dtest, the portion from dtest up to but not including the number, the number, and the rest. The s <- line splits it up into a list, and the next line reshapes it into a character matrix, replacing NULL list items with NA:

library(gsubfn)

# x from prior post
s <- strapply(x, "(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)", c)
do.call(rbind, sapply(s, function(x) if (is.null(x)) NA else x))
     [,1] [,2]             [,3]   [,4]
[1,] NA   NA               NA     NA
[2,] "bc" "dtestblabla"    "2.1"  "bla"
[3,] "c"  "dtestblablabla" "3.88" "blabla"

On Fri, Mar 13, 2009 at 3:10 PM, Tan, Richard r...@panagora.com wrote:
> That works. I want the position just for the purpose of my later manual check. Thanks a lot Gabor.
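The four-way split described in this reply can also be sketched in base R. This is an illustration under assumptions rather than the code from the thread: it uses regexec and regmatches in place of strapply, and the column names before, prefix, number and rest are hypothetical labels added here.

```r
# Base-R sketch: split each string around "dtest...<number>" into four pieces.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

pattern <- "(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)"

# Each element of m is either empty (no match) or the full match
# followed by the four captured groups.
m <- regmatches(x, regexec(pattern, x))

# Drop the full match and keep the four groups; fill non-matches with NA.
parts <- t(vapply(
  m,
  function(g) if (length(g) == 5) g[-1] else rep(NA_character_, 4),
  character(4)
))
colnames(parts) <- c("before", "prefix", "number", "rest")
parts
```

Since nchar(parts[, "before"]) + 1 is the position where dtest starts, a matrix like this may be enough for a manual check without a separate regexpr call.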