[R] extracting a matched string using regexpr

2010-05-05 Thread steven mosher
Given a piece of text, I want to be able to extract the substring matched by a regular expression. This apparently works, but is pretty ugly:
# some html
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
# a pattern to extract 5 digits
pattern <- "[0-9]{5}"
# regexpr returns a start
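For readers following along, here is a minimal sketch of the approach the post describes: `regexpr()` gives the start of the match, and the matched text is then cut out by hand. The `test` string is an assumed reconstruction of the HTML fragment in the post, and the variable names `manual` and `extracted` are illustrative only. `regmatches()` is a base R helper in current versions that does the same bookkeeping; it is not mentioned in the thread.

```r
# Assumed reconstruction of the HTML fragment and pattern from the post
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
pattern <- "[0-9]{5}"

# regexpr() returns the start position of the first match, with the
# match length attached as the "match.length" attribute (-1 if no match)
m <- regexpr(pattern, test)

# Manual extraction from the start position and the match length
manual <- substring(test, m, m + attr(m, "match.length") - 1)

# regmatches() performs the same extraction in one call
extracted <- regmatches(test, m)

print(manual)
print(extracted)
```

Both calls should yield the five-digit string from the first table cell.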

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread Gabor Grothendieck
Here are two ways to extract 5 digits. In the first one, \\1 refers to the portion matched between the parentheses in the regular expression. In the second one, strapply is like apply, where the object to be worked on is the first argument (array for apply, string for strapply), the second modifies
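The two approaches described above might look roughly like the following sketch. The `test` string is an assumed reconstruction of the original poster's HTML fragment, and `five` is an illustrative name. The `strapply` call assumes the gsubfn package is installed, so it is guarded and skipped otherwise.

```r
# Assumed reconstruction of the HTML fragment from the original post
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"

# First way: match the whole string, capture the five digits in
# parentheses, and replace everything with the captured group (\\1)
five <- sub(".*([0-9]{5}).*", "\\1", test)
print(five)

# Second way: strapply() from the gsubfn package applies a function
# (here c) to each match; only run if the package is available
if (requireNamespace("gsubfn", quietly = TRUE)) {
  print(gsubfn::strapply(test, "[0-9]{5}", c, engine = "R")[[1]])
}
```

The `sub` form relies on `.*` consuming everything around the digits, so it returns the input unchanged when there is no match; a match-extraction function avoids that ambiguity.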

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread steven mosher
Thanks, I was looking at that package and reading your mails in the archive. I think my tiny mind got twisted in the regexp. On Wed, May 5, 2010 at 2:35 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Here are two ways to extract 5 digits. In the first one \\1 refers to the portion

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread steven mosher
> test
[1] "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
> sub(".*(\\d{5}).*", "\\1", test)
[1] "</th>"
> sub(".*([0-9]{5}).*", "\\1", test)
[1] "88958"
I think the / in the source throws something off, as the group capture appears not to be working with \\d, although with the bracket version it did. On Wed, May

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread Gabor Grothendieck
That's not what I get:
> test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"
> sub(".*(\\d{5}).*", "\\1", test)
[1] "88958"
> R.version.string
[1] "R version 2.10.1 (2009-12-14)"
I also got the above in R 2.11.0 patched. On Wed, May 5, 2010 at 5:55 PM, steven mosher

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread David Winsemius
On May 5, 2010, at 5:35 PM, Gabor Grothendieck wrote: Here are two ways to extract 5 digits. In the first one, \\1 refers to the portion matched between the parentheses in the regular expression. In the second one, strapply is like apply, where the object to be worked on is the first argument

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread steven mosher
Hmm. I have R11, just downloaded fresh. I'll start a new session and report back. I will note that I've had trouble with \\d, which is why I was using [0-9]. I'm on a Mac here. On Wed, May 5, 2010 at 3:00 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: That's not what I get:
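If \\d behaves differently across platforms, one way to sidestep the question is to avoid the shorthand or to name the regex engine explicitly. The sketch below is only an illustration under the assumption that the discrepancy lies in how the default engine handles \\d, which the thread does not confirm. The `test` string is an assumed reconstruction of the poster's HTML fragment, and the variable names are illustrative.

```r
# Assumed reconstruction of the HTML fragment from the original post
test <- "</tr><tr><th>88958</th><th>Abcdsef</th><th>67.8S</th><th>68.9\nW</th><th>26m</th>"

# Explicit bracket range, as the poster already uses
by_range <- regmatches(test, regexpr("[0-9]{5}", test))

# POSIX character class, which does not depend on the \d shorthand
by_class <- regmatches(test, regexpr("[[:digit:]]{5}", test))

# \d with the PCRE engine requested explicitly via perl = TRUE
by_perl <- regmatches(test, regexpr("\\d{5}", test, perl = TRUE))

print(c(by_range, by_class, by_perl))
```

Extracting the match directly, rather than wrapping it in `.*` and using `sub`, also avoids depending on whether `.` matches the embedded newline, which differs between the default and PCRE engines.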

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread Gabor Grothendieck
I am using Vista. Another thing to try is strapply using the tcl engine (assuming you do have tcltk capabilities) and the R engine. On Vista R 2.11.0 patched I get the same result:
> capabilities()[["tcltk"]]
[1] TRUE
> strapply(test, "\\d{5}", c, engine = "tcl")[[1]]
[1] "88958"
> strapply(test, "\\d{5}",

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread steven mosher
Thanks, perhaps we should report it. On Wed, May 5, 2010 at 4:52 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: I am using Vista. Another thing to try is strapply using the tcl engine (assuming you do have tcltk capabilities) and the R engine. On Vista R 2.11.0 patched I get the same

Re: [R] extracting a matched string using regexpr

2010-05-05 Thread Gabor Grothendieck
Yes, you could bring it up on the R-sig-mac list or file a bug report. On Wed, May 5, 2010 at 10:11 PM, steven mosher mosherste...@gmail.com wrote: Thanks, perhaps we should report it. On Wed, May 5, 2010 at 4:52 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: I am using Vista. Another