Re: [R] Extracting matched expressions
Thanks Jim - it's not elegant, but it works. Instead of using a space as the delimiter, I used \u001E, the Unicode record separator character, since I figure there's less chance of a clash with a character in the match.

Hadley

On Sun, Nov 8, 2009 at 1:40 PM, jim holtman <jholt...@gmail.com> wrote:
> Is this what you want:
>
>     x <- ' one two three '
>     y <- sub(".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*",
>              "\\1 \\2 \\3", x, perl=TRUE)
>     unlist(strsplit(y, ' '))
>     [1] "one"   "two"   "three"
>
> On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham <had...@rice.edu> wrote:
>> Hi all,
>>
>> Is there a tool in base R to extract matched expressions from a
>> regular expression? i.e. given the regular expression
>> "(.*?) (.*?) ([ehtr]{5})" is there a way to extract the character
>> vector c("one", "two", "three") from the string "one two three"?
>>
>> Thanks,
>>
>> Hadley

--
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
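The delimiter swap described in this reply might look roughly like the sketch below. It reuses the pattern from Jim's sub() call; the variable names sep, pat and parts are illustrative assumptions, not taken from the thread.

```r
# Sketch only: Jim's sub() approach, with the captured groups joined by
# U+001E (record separator) instead of a space before splitting.
x   <- " one two three "
sep <- "\u001E"  # unlikely to occur inside a match
pat <- ".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*"

# Replacement string is "\\1<sep>\\2<sep>\\3"
y <- sub(pat, paste("\\1", "\\2", "\\3", sep = sep), x, perl = TRUE)

# Split on the separator literally (fixed = TRUE), not as a regex
parts <- unlist(strsplit(y, sep, fixed = TRUE))
print(parts)
```

Splitting with fixed = TRUE avoids treating the separator as a regular expression. A match that itself contains a space would survive the split intact, which is the clash the reply is guarding against.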
Re: [R] Extracting matched expressions
Is this what you want:

> x <- ' one two three '
> y <- sub(".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*",
+          "\\1 \\2 \\3", x, perl=TRUE)
> unlist(strsplit(y, ' '))
[1] "one"   "two"   "three"

On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham <had...@rice.edu> wrote:
> Hi all,
>
> Is there a tool in base R to extract matched expressions from a
> regular expression? i.e. given the regular expression
> "(.*?) (.*?) ([ehtr]{5})" is there a way to extract the character
> vector c("one", "two", "three") from the string "one two three"?
>
> Thanks,
>
> Hadley

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
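For readers on a later version of R than the one current when this thread was written: base R has since gained regexec() and regmatches() (as far as I know, in R 2.14.0), which return the parenthesized submatches directly and so avoid the substitute-then-split step. A minimal sketch, assuming such a version and using the pattern from the original question (the names m and groups are illustrative):

```r
# Sketch only: requires regexec()/regmatches(), which were not available
# in base R at the time of this thread.
x   <- "one two three"
pat <- "(.*?) (.*?) ([ehtr]{5})"

m <- regexec(pat, x)  # match positions for the whole match and each group

# Element 1 of the result is the full match; drop it to keep only the groups
groups <- regmatches(x, m)[[1]][-1]
print(groups)
```

regmatches() returns a list with one element per input string, so the same call works on a character vector of several strings.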
Re: [R] Extracting matched expressions
strapply in the gsubfn package can do that. It applies the indicated function, here just c, to the back references from the pattern match and then simplifies the result using simplify. (If you omit simplify here it would give a one-element list, like strsplit does.)

    library(gsubfn)
    pat <- "(.*?) (.*?) ([ehtr]{5})"
    strapply("one two three", pat, c, simplify = c)

See home page at: http://gsubfn.googlecode.com

On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham <had...@rice.edu> wrote:
> Hi all,
>
> Is there a tool in base R to extract matched expressions from a
> regular expression? i.e. given the regular expression
> "(.*?) (.*?) ([ehtr]{5})" is there a way to extract the character
> vector c("one", "two", "three") from the string "one two three"?
>
> Thanks,
>
> Hadley