Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sat, 14 Jul 2018, ND via Pcre-dev wrote:

> PCRE2 version 10.31 2018-02-12
> /(*NO_START_OPT)\A(?>(*:1)a)((*:2)x|)/mark
> ab
> 0: a
> 1:
> MK: 1
>
> Resulting mark is "1" when no backtracking is allowed to it.

It just remembers "most recent mark" in the backtracking frame.

Philip

--
Philip Hazel

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Thu, 12 Jul 2018, ND via Pcre-dev wrote:

> And one more thing should also be clarified in docs:
> MARK name unlike MARK position is saved outside assertion or atomic group:

I have tried to clarify this.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sun, 15 Jul 2018, ND via Pcre-dev wrote:

> PCRE2 version 10.31 2018-02-12
> /(?>a(*:1))(?>b(*:1))(*SKIP:1)x|.*/
> abc
> 0: bc
>
> If MARK in atomic don't matter for SKIP then why result is "bc" and not "abc"?
> If MARK in atomic matter for SKIP then why result is not "c"?

This was an obscure bug, which got the backtracking wrong. It was even wrong for

  /(?>a(*:1))b(?>)(*SKIP:1)x|.*/

and I am amazed nobody spotted it earlier. The bug was in the interpreter; JIT did not have the bug. I have fixed it and committed the patch. Thanks for the report. The pattern now matches "abc", as does Perl.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sun, 15 Jul 2018, ND via Pcre-dev wrote:

> PCRE2 version 10.31 2018-02-12
> /(?>a(*:1))(?>b(*:1))(*SKIP:1)x|.*/
> abc
> 0: bc
>
> If MARK in atomic don't matter for SKIP then why result is "bc" and not "abc"?
> If MARK in atomic matter for SKIP then why result is not "c"?

This does look like a bug. Perl matches "abc" and that's what I would expect. I will investigate.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
And one more possible bug:

PCRE2 version 10.31 2018-02-12
/(?>a(*:1))(?>b(*:1))(*SKIP:1)x|.*/
abc
0: bc

If MARK in atomic don't matter for SKIP then why result is "bc" and not "abc"?
If MARK in atomic matter for SKIP then why result is not "c"?
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-14 15:12, ph10 wrote:

> Feel free to look at the code and suggest patches. However, I don't think it is easy.

Sorry. I'm not a C programmer.

> It doesn't have to do anything special when it passes a (*MARK:NAME) other than record a backtracking point. Then when (*SKIP:NAME) is triggered, it backtracks till it hits a matching (*MARK:NAME) and then the current position in the subject is where to bumpalong to.

I wonder that engine not doing something special. Consider this example:

PCRE2 version 10.31 2018-02-12
/(*NO_START_OPT)\A(?>(*:1)a)((*:2)x|)/mark
ab
0: a
1:
MK: 1

Resulting mark is "1" when no backtracking is allowed to it. So I can guess PCRE do a special: it copies a pointers to mark names to another memory places. Maybe it copies pointer to current mark in every backtracking frame. Maybe something else *special* tactic. Isn't it?

> Keeping a separate table would require memory management, and its own backtracking mechanism! If a branch that contains a (*MARK:NAME) fails to match, the (*MARK) must be forgotten. Consider
>
>   /(xxx(*MARK:A)xxx|yyy(*MARK:A)yyy)...(*SKIP:A).../

Now I don't see any extended backtracking needs. Only unsetting of "Mark-Position" fields of table.

Howbeit I see that such changes can be made in theory only after Perl changed accordingly. I can't neither report to Perl authors about this weird and unobvious behavior, nor programming in C. So you free to close this topic.
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sat, 14 Jul 2018, ND via Pcre-dev wrote:

> It seems instead of maintaining only MarkNames PCRE can maintain a table with
> MarkName-MarkPosition pairs. And so not have need to backtrack to access MARK
> position data. And not lose MarkPosition information.

Feel free to look at the code and suggest patches. However, I don't think it is easy. At present, the matching engine does everything by backtracking. Note that this gives the same results as Perl.

It doesn't have to do anything special when it passes a (*MARK:NAME) other than record a backtracking point. Then when (*SKIP:NAME) is triggered, it backtracks till it hits a matching (*MARK:NAME) and then the current position in the subject is where to bumpalong to.

Keeping a separate table would require memory management, and its own backtracking mechanism! If a branch that contains a (*MARK:NAME) fails to match, the (*MARK) must be forgotten. Consider

  /(xxx(*MARK:A)xxx|yyy(*MARK:A)yyy)...(*SKIP:A).../

The *SKIP must activate whichever MARK matched, because they may have different bumpalong points. I think what you are suggesting would be very difficult to implement.

Philip

--
Philip Hazel
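
The point about forgetting a (*MARK) when its branch fails can be sketched with a toy model. This is not PCRE2's code: the `BacktrackStack` class and its methods are hypothetical names, and the pattern in the comments, /(xxx(*MARK:A)xxx|xxxxx(*MARK:A)y)(*SKIP:A).../ on the subject "xxxxxy", is an assumed variation of the one above, chosen only so that the two marks land at different offsets. A plain name-to-offset table with no undo step is kept alongside for comparison.

```python
from typing import Dict, List, Optional, Tuple


class BacktrackStack:
    """Toy stand-in for a matcher's backtracking points (hypothetical)."""

    def __init__(self) -> None:
        self.frames: List[Tuple[str, int]] = []

    def checkpoint(self) -> int:
        """Remember the current depth so a failing branch can be undone."""
        return len(self.frames)

    def push_mark(self, name: str, offset: int) -> None:
        """Passing (*MARK:name) only records a backtracking point."""
        self.frames.append((name, offset))

    def fail_back_to(self, checkpoint: int) -> None:
        """A failed branch drops every frame it created, marks included."""
        del self.frames[checkpoint:]

    def skip(self, name: str) -> Optional[int]:
        """(*SKIP:name): walk back to the nearest matching mark, if any."""
        for mark, offset in reversed(self.frames):
            if mark == name:
                return offset
        return None


stack = BacktrackStack()
table: Dict[str, int] = {}   # naive name -> offset table with no undo step

cp = stack.checkpoint()

# Branch 1: "xxx" matches at 0..3, the mark is set, then "xxx" fails on "xxy".
stack.push_mark("A", 3)
table["A"] = 3
stack.fail_back_to(cp)

# The stack has forgotten the mark; the table still holds a stale entry.
after_failure = (stack.skip("A"), table.get("A"))

# Branch 2: "xxxxx" matches at 0..5, the mark is set, then "y" matches.
stack.push_mark("A", 5)
table["A"] = 5
after_match = stack.skip("A")

print(after_failure)   # (None, 3)
print(after_match)     # 5
```

In this sketch the stack gets the "forgetting" for free from ordinary backtracking, whereas the table would need its own undo logic to drop the stale entry.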
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-14 07:16, ph10 wrote:

> > Why it need to backtrack?
> > Why not do a "bumpalong" advance to the next starting character straight away?
>
> It has to backtrack to the *MARK because that is where the bumpalong data is remembered. There may be many *MARKs, each with a different name. You can't just keep a single data item.

I think this is something weird here: PCRE during matching process have information about all passed MARK names and MARK positions.

Information about MARK names is somewhere nearby and can be easily retrieved. Information about MARK positions is saved somewhere deep and can be retrieved only by backtracking. If no backtracking available then there is no way to access information.

It seems instead of maintaining only MarkNames PCRE can maintain a table with MarkName-MarkPosition pairs. And so not have need to backtrack to access MARK position data. And not lose MarkPosition information.
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sat, 14 Jul 2018, ND via Pcre-dev wrote:

> On 2018-07-13 16:08, ph10 wrote:
> > When SKIP has a name, it backtracks until it hits a MARK with the same name.
>
> Why it need to backtrack?
> Why not do a "bumpalong" advance to the next starting character straight away?

It has to backtrack to the *MARK because that is where the bumpalong data is remembered. There may be many *MARKs, each with a different name. You can't just keep a single data item.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-13 16:08, ph10 wrote:

> When SKIP has a name, it backtracks until it hits a MARK with the same name.

Why it need to backtrack?
Why not do a "bumpalong" advance to the next starting character straight away?
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Fri, 13 Jul 2018, ND via Pcre-dev wrote:

> The SKIP verb don't need backtracking after it fires: there is bumpalong and
> new match. If MARK position is saved then there is no problem for engine to
> discard current matching and start new matching at saved position without any
> backtracking. Isn't it?

When SKIP has a name, it backtracks until it hits a MARK with the same name.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-13 07:23, ph10 wrote:

> On Thu, 12 Jul 2018, ND via Pcre-dev wrote:
>
> > And one more thing should also be clarified in docs:
> > MARK name unlike MARK position is saved outside assertion or atomic group:
>
> The MARK position *is* saved; it's just that there is never a backtrack into an atomic group, so that data is never accessed. But I'll take a look at the wording again.

The SKIP verb don't need backtracking after it fires: there is bumpalong and new match. If MARK position is saved then there is no problem for engine to discard current matching and start new matching at saved position without any backtracking. Isn't it?

So maybe not impossibility of backtracking into atomic group is the reason of current behavior. Maybe Perl compatibility is it?
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Thu, 12 Jul 2018, ND via Pcre-dev wrote:

> And one more thing should also be clarified in docs:
> MARK name unlike MARK position is saved outside assertion or atomic group:

The MARK position *is* saved; it's just that there is never a backtrack into an atomic group, so that data is never accessed. But I'll take a look at the wording again.

Philip

--
Philip Hazel
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-12 07:25, ph10 wrote:

> The (*MARK) is inside the assertion. That is what matters. I have updated the documentation to say this:
>
>   The search for a (*MARK) name uses the normal backtracking mechanism, which means that it does not see (*MARK) settings that are inside atomic groups or assertions, because they are never re-entered by backtracking.

And one more thing should also be clarified in docs:
MARK name unlike MARK position is saved outside assertion or atomic group:

PCRE2 version 10.31 2018-02-12
/a(?=.(*:1))/mark
ab
0: a
MK: 1
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Wed, 11 Jul 2018, ND via Pcre-dev wrote:

> I seen this docs before.
> But in example verb not appears inside assertion. It appears after it.

The (*MARK) is inside the assertion. That is what matters. I have updated the documentation to say this:

  The search for a (*MARK) name uses the normal backtracking mechanism, which means that it does not see (*MARK) settings that are inside atomic groups or assertions, because they are never re-entered by backtracking. Compare the following pcre2test examples:

      re> /a(?>(*MARK:X))(*SKIP:X)(*F)|(.)/
    data: abc
     0: a
     1: a
    data:
      re> /a(?:(*MARK:X))(*SKIP:X)(*F)|(.)/
    data: abc
     0: b
     1: b

  In the first example, the (*MARK) setting is in an atomic group, so it is not seen when (*SKIP:X) triggers, causing the (*SKIP) to be ignored. This allows the second branch of the pattern to be tried at the first character position. In the second example, the (*MARK) setting is not in an atomic group. This allows (*SKIP:X) to immediately cause a new matching attempt to start at the second character. This time, the (*MARK) is never seen because "a" does not match "b", so the matcher immediately jumps to the second branch of the pattern.

This is exactly the same behaviour as Perl.

Philip

--
Philip Hazel
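
The difference between the two pcre2test examples can be sketched with a toy model of a backtracking stack. This is not PCRE2's code; the `Frame` type and the `skip_target` and `run` helpers are hypothetical names invented for illustration. The one assumption modelled is the one stated in the documentation text: leaving an atomic group discards the backtracking points created inside it, so a later (*SKIP:X) cannot find a (*MARK:X) that was set there.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One backtracking point: an optional mark name and a subject offset."""
    mark: Optional[str]
    offset: int


def skip_target(stack: List[Frame], name: str) -> Optional[int]:
    """Walk back through the frames looking for a mark called `name`.

    Returns the offset stored with that mark (the bumpalong point), or
    None if no such frame is reachable, in which case the (*SKIP:name)
    is treated as ignored.
    """
    for frame in reversed(stack):
        if frame.mark == name:
            return frame.offset
    return None


def run(atomic: bool) -> Optional[int]:
    """Model a(?>(*MARK:X))(*SKIP:X) or a(?:(*MARK:X))(*SKIP:X) on "abc"."""
    stack: List[Frame] = [Frame(mark=None, offset=0)]  # start of the attempt
    depth_before_group = len(stack)
    stack.append(Frame(mark="X", offset=1))  # (*MARK:X) reached after "a"
    if atomic:
        # Leaving an atomic group throws away the frames created inside it.
        del stack[depth_before_group:]
    return skip_target(stack, "X")


print(run(atomic=True))    # None: SKIP is ignored, so (.) is tried at offset 0
print(run(atomic=False))   # 1: the next match attempt starts at "b"
```

The two results line up with the pcre2test output above: "a" is matched when the mark is hidden in the atomic group, and "b" when it is visible.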
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On 2018-07-11 16:27, ph10 wrote:

> This already appears in the docs:
>
>   However, when one of these verbs appears inside an atomic group or in an assertion that is true, its effect is confined to that group, because once the group has been matched, there is never any backtracking into it.

I seen this docs before.
But in example verb not appears inside assertion. It appears after it.
Re: [pcre-dev] (*SKIP:NAME) when (*MARK:NAME) is in assertion
On Sun, 8 Jul 2018, ND via Pcre-dev wrote:

> It seems if mark name is defined in assertion then SKIP with this name is
> ignored.
> Maybe a little docs clarification about this needed.

This already appears in the docs:

  However, when one of these verbs appears inside an atomic group or in an assertion that is true, its effect is confined to that group, because once the group has been matched, there is never any backtracking into it.

The "verbs" are the backtracking control verbs. Perl behaves in the same way. I will repeat the information above in some more places in the documentation.

Philip

--
Philip Hazel