Re: [R] Minimal match to regexp?

2023-01-26 Thread Duncan Murdoch
I'll submit a bug report.

On 25/01/2023 8:38 p.m., Andrew Simmons wrote:
> It seems like a bug to me. Using perl = TRUE, I see the desired result:
>
> ```
> x <- "\n```html\nblah blah \n```\n\n```r\nblah blah\n```\n"
> pattern2 <- "\n([`]{3,})html\n.*?\n\\1\n"
> cat(regmatches(x, regexpr(pattern2, x,
> ```

Re: [R] Minimal match to regexp?

2023-01-25 Thread Andrew Simmons
It seems like a bug to me. Using perl = TRUE, I see the desired result:

```
x <- "\n```html\nblah blah \n```\n\n```r\nblah blah\n```\n"
pattern2 <- "\n([`]{3,})html\n.*?\n\\1\n"
cat(regmatches(x, regexpr(pattern2, x, perl = TRUE)))
```

If you change it to something like:

```
x <- c(
```
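
A self-contained sketch of the perl = TRUE case described in this message may help when reproducing it. The string and pattern are the ones quoted in the thread. The `fence` variable and the use of strrep() and paste0() to assemble the string are editorial conveniences, not part of the original post. Only the PCRE result is checked here, because the behaviour of the default engine is exactly what the thread is questioning.

```r
# Build the fence marker programmatically (an editorial convenience, not from
# the thread) so the example contains no literal run of three backticks.
fence <- strrep("`", 3)

x <- paste0("\n", fence, "html\nblah blah \n", fence, "\n\n",
            fence, "r\nblah blah\n", fence, "\n")

# Same pattern as in the message: an opening fence of 3+ backticks tagged
# "html", a minimal run of characters, then the same fence again via \\1.
pattern2 <- "\n([`]{3,})html\n.*?\n\\1\n"

# perl = TRUE selects the PCRE engine instead of the default one.
m <- regmatches(x, regexpr(pattern2, x, perl = TRUE))
cat(m)
```

With PCRE, `.` does not match a newline by default, so the match stops at the first closing fence and covers only the html block.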

Re: [R] Minimal match to regexp?

2023-01-25 Thread Duncan Murdoch
Thanks for pointing out my mistake. I oversimplified the real problem. I'll try to post a version of it that comes closer:

Suppose I have a string like this:

x <- "\n```html\nblah blah \n```\n\n```r\nblah blah\n```\n"

If I cat() it, I see that it is really markdown source: ```html

Re: [R] Minimal match to regexp?

2023-01-25 Thread Jeff Newmiller
Perhaps sub( "^.*(a.*?a).*$", "\\1", x )

On January 25, 2023 4:19:01 PM PST, Duncan Murdoch wrote:
>The docs for ?regexp say this: "By default repetition is greedy, so the
>maximal possible number of repeats is used. This can be changed to ‘minimal’
>by appending ? to the quantifier.

Re: [R] Minimal match to regexp?

2023-01-25 Thread Duncan Murdoch
On 25/01/2023 7:19 p.m., Duncan Murdoch wrote:
> The docs for ?regexp say this: "By default repetition is greedy, so the maximal possible number of repeats is used. This can be changed to ‘minimal’ by appending ? to the quantifier. (There are further quantifiers that allow approximate matching:

Re: [R] Minimal match to regexp?

2023-01-25 Thread Andrew Simmons
grep(value = TRUE) just returns the strings which match the pattern. You have to use regexpr() or gregexpr() if you want to know where the matches are:

```
x <- "abaca"
# extract only the first match with regexpr()
m <- regexpr("a.*?a", x)
regmatches(x, m)
# or
# extract every match with
```
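
The distinction drawn in this message can be sketched as follows. The input vector is a hypothetical one chosen for illustration (only "abaca" comes from the thread), and the variable names `whole`, `first` and `all_matches` are likewise made up here.

```r
# Hypothetical input: two elements that contain a match and one that does not.
x <- c("abaca", "xyz", "aba-aca")

# grep(value = TRUE) returns the whole elements that contain a match.
whole <- grep("a.*?a", x, value = TRUE)

# regexpr() locates the first match in each element; regmatches() extracts it
# and drops elements with no match.
first <- regmatches(x, regexpr("a.*?a", x))

# gregexpr() locates every non-overlapping match, one list entry per element.
all_matches <- regmatches(x, gregexpr("a.*?a", x))

print(whole)
print(first)
print(all_matches)
```

Here `whole` keeps "abaca" and "aba-aca" intact, `first` holds "aba" for each of those two elements, and `all_matches` is a list whose third entry contains both "aba" and "aca".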

[R] Minimal match to regexp?

2023-01-25 Thread Duncan Murdoch
The docs for ?regexp say this: "By default repetition is greedy, so the maximal possible number of repeats is used. This can be changed to ‘minimal’ by appending ? to the quantifier. (There are further quantifiers that allow approximate matching: see the TRE documentation.)" I want the
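
To make the quoted passage concrete, here is a minimal sketch of greedy versus minimal repetition. The string "abaca" is borrowed from a reply elsewhere in the thread, and `first_match` is a hypothetical helper introduced only for this illustration.

```r
# Hypothetical helper: return the first match of `pattern` in `x`.
first_match <- function(pattern, x) regmatches(x, regexpr(pattern, x))

x <- "abaca"

# Greedy: .* takes as many characters as it can.
greedy <- first_match("a.*a", x)

# Minimal: appending ? makes .* take as few characters as it can.
minimal <- first_match("a.*?a", x)

print(c(greedy = greedy, minimal = minimal))
```

For this simple pattern the greedy form matches the whole string "abaca", while the minimal form stops at the first possible end and matches "aba".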