[R] extracting a matched string using regexpr

steven mosher Wed, 05 May 2010 14:14:11 -0700

Given a piece of text, I want to be able to extract the substring that
matches a regular expression.


This appears to work, but it is pretty ugly:

# some html
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
# a pattern to extract 5 digits
pattern <- "[0-9]{5}"
# regexpr returns the start position of the first match, plus an attribute
# "match.length": attr(regexpr(pattern, test), "match.length")
# get the substring from the start point to the stop point, where
# stop = start + length - 1
answer <- substr(test, regexpr(pattern, test)[1], regexpr(pattern, test)[1] + attr(regexpr(pattern, test), "match.length") - 1)
answer
# [1] "88958"
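
For comparison, here is a minimal sketch of the same extraction with regexpr() called only once. The variable names m, start and len are illustrative and not from the original post. The last line uses regmatches(), which is part of base R from version 2.14.0 onward, so it is an assumption that it is available in the R version at hand.

```r
# Same sample input and pattern as in the post
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
pattern <- "[0-9]{5}"

# Call regexpr() once and reuse the result
m <- regexpr(pattern, test)
start <- as.integer(m)            # start position of the first match
len <- attr(m, "match.length")    # length of that match

# stop = start + length - 1
answer <- substr(test, start, start + len - 1)

# Equivalent one-step extraction (assumes R >= 2.14.0)
answer2 <- regmatches(test, m)
```

If the pattern does not match, regexpr() returns -1, so checking m before calling substr() is a sensible precaution.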

I tried using sub(pattern, replacement, x) with a regexp that captured the
group. I'd found an example of this in the list archives,
but it didn't seem to work.
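
The failing sub() call is not shown, so the following is only a guess at the cause. sub() replaces just the part of the string that the pattern matches, so a pattern containing only the capture group puts the match back and leaves the string unchanged. A sketch that works pads the pattern so it consumes the whole string. The perl = TRUE argument and the (?s) flag are choices made here so that "." also matches the embedded newline; they are not from the original post.

```r
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"

# Capture group alone: the match is replaced by itself, so nothing changes
unchanged <- sub("([0-9]{5})", "\\1", test)

# Pattern spans the whole string, so everything except the group is dropped.
# (?s) lets "." match the newline; .*? is lazy so the first 5 digits are taken.
answer <- sub("(?s)^.*?([0-9]{5}).*$", "\\1", test, perl = TRUE)
```

Note that when there is no match, sub() returns the input unchanged, so this approach cannot by itself distinguish "no match" from a string that consists only of the match.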


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
