On Tue, Sep 28, 2010 at 6:52 AM, Titus von der Malsburg <malsb...@gmail.com> wrote:
> On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward
> <michael.bedw...@gmail.com> wrote:
>> What Titus wants to do is akin to retrieving capturing groups from a
>> Matcher object in Java.
>
> Precisely. Here's the description:
>
> http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html#start(int)
>
> Gabor's lookbehind trick solves some special cases but it's not the

The only limitation is that, in the regular expressions supported by R, you
cannot have repetition in the (?<=...) portion. But none of your examples
(neither the one you gave nor the one below) requires that, since if the
prior expression ends in X+ you can just use X. Are you sure it does not
cover all your actual situations?

If you truly do have situations that require repetition, a gregexpr plus
gsubfn will do it in one line. Parenthesize the portion of the regular
expression you want to capture and replace every character in it with X (or
some other character that does not otherwise occur). Then find the positions
and lengths of the strings of X.

> gregexpr("X+", gsubfn("a(b+)", ~ gsub(".", "X", x), "abcdaabbcbbb"))
[[1]]
[1] 1 5
attr(,"match.length")
[1] 1 2

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
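
For reference, here is a minimal base-R sketch of the two ideas discussed in this thread: the lookbehind trick, and reading capture-group offsets directly (the rough analogue of Java's Matcher.start(int)). It assumes a version of R more recent than the one current at the time of this thread, in which gregexpr() with perl = TRUE returns "capture.start" and "capture.length" attributes. The variable names are illustrative only. Note that the positions in the gsubfn output above index the substituted string, in which the leading "a" has been consumed with the match; the sketch below reports positions in the original string.

```r
s <- "abcdaabbcbbb"

# 1. Lookbehind: match b+ only where it is immediately preceded by "a".
#    The lookbehind itself must be fixed-length (no repetition inside it).
lb       <- gregexpr("(?<=a)b+", s, perl = TRUE)[[1]]
lb_start <- as.integer(lb)                        # 2 7
lb_len   <- as.integer(attr(lb, "match.length"))  # 1 2

# 2. Capture groups: match a(b+) and read the offsets of group 1.
#    Assumes gregexpr() exposes capture.start / capture.length (perl = TRUE).
m         <- gregexpr("a(b+)", s, perl = TRUE)[[1]]
cap_start <- as.integer(attr(m, "capture.start")[, 1])   # 2 7
cap_len   <- as.integer(attr(m, "capture.length")[, 1])  # 1 2

# Recover the captured text from the offsets.
captured <- substring(s, cap_start, cap_start + cap_len - 1)  # "b" "bb"
```

The second form does not require the prefix to be fixed-length, so it should also cover patterns such as "a+(b+)" where a lookbehind cannot be used.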