Re: [pcre-dev] Start optimizations with partial match

2019-06-22 Thread ND via Pcre-dev
On 2019-06-23 04:33, ND wrote:
> Or do these calculations occur at compile time, while the partial matching flag is set at match time?

Oh! Now I have read the docs about it. It seems that PARTIAL is a compile-time option only for JIT. So it seems that disabling these calculations may matter to JIT only. May

Re: [pcre-dev] Start optimizations with partial match

2019-06-22 Thread ND via Pcre-dev
Or do these calculations occur at compile time, while the partial matching flag is set at match time?

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev

[pcre-dev] Start optimizations with partial match

2019-06-22 Thread ND via Pcre-dev
Good day! Here is a pcre2test listing:

/(?<=ab)cde/info
Capture group count = 0
Max lookbehind = 2
First code unit = 'c'
Last code unit = 'e'
Subject length lower bound = 3
ab\=ph
Partial match: ab
               <<

We can see that PCRE calculates first code unit, last code unit and subject

Re: [pcre-dev] Max lookbehind calculation

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /(?<=(?<=a)b)c.*/info
> Capture group count = 0
> Max lookbehind = 1
> First code unit = 'c'
> Subject length lower bound = 1
> abc\=ph
> Partial match: bc
>                <
>
> Why max lookbehind=1, but not 2?

Re: [pcre-dev] Some words about assertion docs

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Sorry for my bad English.
> I need to find the word that is closest to the end of the text and occurs at least 10
> times in that text.

Yes, I understand that now. I will think about it.

Philip

--
Philip Hazel

Re: [pcre-dev] Some words about assertion docs

2019-06-22 Thread ND via Pcre-dev
On 2019-06-22 15:20, ph10 wrote:
> On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
>> Your example is not working right (let's change 10 to 3 for simplicity):
>>
>> /\A.*\b(\w++)(?>.*?\b\1\b){2}/
>> word1 word1 word2 word2 word2 word1
>> 0: word1 word1 word2 word2 word2
>> 1: word2
>>
>> We want to capture

Re: [pcre-dev] Document SKIP position before or equal start_offset

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> > If (*SKIP) is used inside a lookbehind to specify a new starting
> > position...
>
> I suggest removing "inside a lookbehind".
> A new starting position that is not later than the starting point of the
> current match may occur without lookbehind:

Re: [pcre-dev] Max lookbehind calculation

2019-06-22 Thread ND via Pcre-dev
Here is a second attempt, with another example:

PCRE2 version 10.33 2019-04-16
/(?<=(?<=a)b)c.*/info
Capture group count = 0
Max lookbehind = 1
First code unit = 'c'
Subject length lower bound = 1
abc\=ph
Partial match: bc
               <

Why max lookbehind=1, but not 2?

Re: [pcre-dev] Some words about assertion docs

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Your example is not working right (let's change 10 to 3 for simplicity):
>
> /\A.*\b(\w++)(?>.*?\b\1\b){2}/
> word1 word1 word2 word2 word2 word1
> 0: word1 word1 word2 word2 word2
> 1: word2
>
> We want to capture "word1" as the word closest to the end

Re: [pcre-dev] several messages

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> /\A(?:a|(?=b)|.){50}\z/
> abc
> 0: abc
>
> when engine in a strange way decides that it was exactly 50 repetitions.

That is not an unlimited repeat, so there is no special action for matching an empty string. Therefore, (?=b) matches 47 times. A

Re: [pcre-dev] Document SKIP position before or equal start_offset

2019-06-22 Thread ND via Pcre-dev
Updated docs:
If (*SKIP) is used inside a lookbehind to specify a new starting position...

I suggest removing "inside a lookbehind". A new starting position that is not later than the starting point of the current match may occur without lookbehind:

PCRE2 version 10.33 2019-04-16

Re: [pcre-dev] Some words about assertion docs

2019-06-22 Thread ND via Pcre-dev
On 2019-06-22 08:56, ph10 wrote:
> On Fri, 21 Jun 2019, ND via Pcre-dev wrote:
>> Imagine that we have a text. There are some words in this text that occur at
>> least 10 times. We want to find among them the word that is closest to the
>> end of the text.
>>
>> If lookahead assertion is

Re: [pcre-dev] several messages

2019-06-22 Thread ND via Pcre-dev
On 2019-06-22 08:51, ph10 wrote:
> There must be plenty of examples where removing \z changes what is matched. How about /[ab]*\z/ matched against "aaaxxxbbb"?

I believed it was obvious that we were talking about matching from one position of the subject. Sorry that I didn't say it explicitly. In your
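The /[ab]*\z/ example quoted above can be reproduced with a small sketch. It uses Python's re module as a stand-in, not PCRE2 or pcre2test, and it assumes that Python's \Z (end of string only) plays the role of PCRE's \z; the variable names are illustrative only.

```python
import re

subject = "aaaxxxbbb"

# Without the end anchor, the leftmost start position (offset 0) already
# succeeds, so the greedy repeat stops at the first "x".
unanchored = re.search(r"[ab]*", subject)

# With the end anchor, every start position before offset 6 fails the
# end-of-subject test, so the search moves on to the trailing "bbb".
# Assumption: Python's \Z is used here in place of PCRE's \z.
anchored = re.search(r"[ab]*\Z", subject)

print(unanchored.group(), unanchored.span())
print(anchored.group(), anchored.span())
```

The two searches return different substrings because the unanchored search is free to stop at an earlier start position, which is the case the quoted message describes.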

Re: [pcre-dev] Some words about assertion docs

2019-06-22 Thread ph10
On Fri, 21 Jun 2019, ND via Pcre-dev wrote:
> Imagine that we have a text. There are some words in this text that occur at
> least 10 times. We want to find among them the word that is closest to the
> end of the text.
>
> If the lookahead assertion is non-possessive then we can use this pattern:
>

Re: [pcre-dev] several messages

2019-06-22 Thread ph10
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> A successful match of "X*\z" means that PCRE says: X CAN be successfully
> repeated until the very end of the subject (let the match be "abc", for example).
> When we use "X*" we want to say: repeat X as much as it can.

Yes, but there is special

Re: [pcre-dev] several messages

2019-06-22 Thread ND via Pcre-dev
Thanks a lot for clarifying the docs and for your patience with me.

On 2019-06-21 16:18, ph10 wrote:
> On Mon, 17 Jun 2019, ND via Pcre-dev wrote:
>> The second of my little concerns is that "X*\z" and "X*" both match, and the matches are different. I understand why that is from a procedural point of view.