[R] search for string inside a string

2009-03-13 Thread Tan, Richard
Hi, sorry if this is too stupid a question, but how do I do a string search
in R?

I have a data frame A with A$test like:
 
test1
bcdtestblabla2.1bla
cdtestblablabla3.88blabla
 
and I want to search for a substring that starts with 'dtest' and ends with
a number, and return the location of that substring and the number, so the
end result would be:

NA    NA
3     2.1
2     3.88
 
I think grep can probably do this, but I am new to the function and would
like a good example.
 
Thanks,
Richard
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] search for string inside a string

2009-03-13 Thread Gabor Grothendieck
Try this.  We use regexpr to get the positions and
strapply puts the values in list s.  The unlist statement
converts NULL to NA and simplifies the list, s, to a
numeric vector.  For more info on strapply see
http://gsubfn.googlecode.com

library(gsubfn)  # strapply

x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

dtest.info <- cbind(posn = regexpr("dtest", x),
   value = { s <- strapply(x, "dtest[^0-9]*([0-9][0-9.]*)", as.numeric)
             unlist(ifelse(sapply(s, length), s, NA))
   })

> # the above may be sufficient, but
> # if it's important to NA out rows with no match, add
> dtest.info[dtest.info[,1] < 0,] <- NA
> dtest.info
     posn value
[1,]   NA    NA
[2,]    3  2.10
[3,]    2  3.88

Why do you want the position?   Is there a further transformation needed?
What is it?  There may be even easier approaches to the entire problem.
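
A base R sketch of the same table, for readers who would rather not depend on gsubfn. It assumes the three example strings from the original post. regexpr, regexec and regmatches are standard base functions; returning a data frame rather than a matrix is only an illustrative choice, not what was posted above.

```r
# Base-R sketch (no gsubfn): position of "dtest" and the number that follows it.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

# regexpr() gives the start of the first match, or -1 when there is none.
posn <- as.integer(regexpr("dtest", x, fixed = TRUE))
posn[posn < 0] <- NA

# regexec() records the full match plus each parenthesised group;
# regmatches() turns that into character vectors (length 0 when no match).
m <- regmatches(x, regexec("dtest[^0-9]*([0-9][0-9.]*)", x))
value <- vapply(
  m,
  function(g) if (length(g) == 2) as.numeric(g[2]) else NA_real_,
  numeric(1)
)

dtest.info <- data.frame(posn = posn, value = value)
dtest.info
```

Strings without a match get NA in both columns, so no separate "NA out" step is needed in this variant.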



Re: [R] search for string inside a string

2009-03-13 Thread Tan, Richard
That works.  I want the position just for a manual check later.
Thanks a lot, Gabor.
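
One way such a manual check could look, as a rough sketch only: pull out the text that sits at each reported position and confirm that it really is "dtest". The positions are hard-coded here from the result reported in the reply above, and the names found and ok are hypothetical.

```r
# Sketch of a manual check: does each reported position really hold "dtest"?
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")
posn <- c(NA, 3L, 2L)  # positions as reported in the earlier reply

# substr() returns NA where the start position is NA.
found <- substr(x, posn, posn + nchar("dtest") - 1L)

# A row passes if it had no match at all or if the extracted text is "dtest".
ok <- is.na(posn) | found == "dtest"

data.frame(x, posn, found, ok)
```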



Re: [R] search for string inside a string

2009-03-13 Thread Gabor Grothendieck
That check might be done by splitting each string into the portion before
dtest, the portion from dtest up to but not including the number,
the number, and the rest.  The s <- line splits it up into a list, and
the next line reforms it into a character matrix, replacing NULL
list items with NA:

> library(gsubfn)
> # x from prior post
> s <- strapply(x, "(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)", c)
> do.call(rbind, sapply(s, function(x) if (is.null(x)) NA else x))
     [,1] [,2]             [,3]   [,4]
[1,] NA   NA               NA     NA
[2,] "bc" "dtestblabla"    "2.1"  "bla"
[3,] "c"  "dtestblablabla" "3.88" "blabla"
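
The same four-way split can be sketched in base R with regexec and regmatches. This is an alternative written under assumptions, not the gsubfn approach above: the ^ and $ anchors and the column names are illustrative additions.

```r
# Base-R sketch: split each string into before / dtest part / number / rest.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

pattern <- "^(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)$"
m <- regmatches(x, regexec(pattern, x))

# Each match holds the full string plus four groups; drop the full match.
# Strings with no match become a row of four NAs.
parts <- t(vapply(
  m,
  function(g) if (length(g) == 5) g[-1] else rep(NA_character_, 4),
  character(4)
))
colnames(parts) <- c("before", "dtest_part", "number", "rest")  # hypothetical names
parts
```

Because vapply fixes the length of each result at four, a string that does not match cannot silently change the shape of the matrix.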

