On 2021-06-20 11:27, Philip Hazel wrote:
A little bit further up from what you quoted, the docs say this: "The two
"extended" options are not independent; unsetting either
one cancels the effects of both of them." So (?-x) and (?-xx) are the
same, and unset both (?x) and (?xx).
I apologize
PCRE docs say:
If the first character following (? is a circumflex, it causes all of
the above options to be unset. Thus, (?^) is equivalent to (?-imnsx).
There is an "xx" option, so maybe the docs have a typo?
- "all of the above options" -> "all of the above options but xx"
- or "(?^) is
On 2021-06-06 05:53, Zoltán Herczeg wrote:
ND I think you have found a pretty nice Perl bug, maybe you could report
it to them.
Zoltan, thank you for the great investigation.
Now I am sure it looks like a Perl bug.
Everybody, feel free to report it. My English is bad and I have many
difficulties
Here is pcretest listing:
PCRE2 version 10.35 2020-05-09
/(?:(a)?\1)+/
aaa
0: aaa
Expected result:
0: aa
Perl result:
0: aa
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
After some thought, I think the second listing has no inconsistency.
Feel free to forget about it.
Let's consider only the first one.
Here are 2 pcretest listings:
1.
PCRE2 version 10.35 2020-05-09
/a\K.(?0)*/
abac
0: bac
Expected result: c
Perl's result: c
2.
/a\Kb/replace=-$0-
ab
1: a-b-
The PCRE docs say about $ substitutions:
"The number may be zero to include the entire matched string."
But really we can see
PCRE2 version 10.34-RC1 2019-04-22
/\A(?:\1b|(?=(a)))*?\z/
ab
No match
/\A(?:\1b|(?=(a)))*\z/
ab
No match
Both patterns should match successfully after the second iteration.
But PCRE2 has the following rule:
It is possible to construct infinite loops by following a group that can
match no
Good day!
There is a function find_minlength() in pcre2_study.c that calculates the
minimum subject length.
1. It drops patterns that have the (*ACCEPT) verb.
/* ACCEPT makes things far too complicated; we have to give up. In fact,
from 10.34 onwards, if a pattern contains (*ACCEPT), this
wrote in their message of Sat, 10 Aug 2019 14:03:50 +0300:
- bugs
- performance issues
- brings excessive work to user
Now I report only about potential bugs.
Unfortunately I believe we have reached the limit of what can be done to
the existing PCRE2 design to support multi-segment matching
On 2019-08-08 16:59, ph10 wrote:
On Sat, 3 Aug 2019, ND via Pcre-dev wrote:
> May be it can be useful to have ability to set a limits of lookbehind
> search for performance reasons.
> I can imagine a rule: If nonfixedlength lookbehind immediately preceded by
> capt
It may be useful to be able to limit the lookbehind search for performance
reasons.
I can imagine a rule: if a non-fixed-length lookbehind is immediately
preceded by a capture group, then it is restricted to the start position of
this group.
For example, in the pattern
abc(\w++)(?<=\d+)
On 2019-08-03 04:44, Zoltán Herczeg wrote:
> I was faced with a need of nonfixed length lookbehind two times:
> 1. when data came by stream of 24kB blocks and I need to find a last
> numeric in each of it
> /.{24000}(?<=(\d++)\D*+)/g
Even if this would work, the result of this would be always the
On 2019-08-01 08:20, Zoltán Herczeg wrote:
If we would use your idea for doing (0,n-1) match, that could be too
slow for large subject, and people would complain.
Yes, it could be slow. But:
- we can use [\G,n-1]
- we can honestly warn about it in docs. Performance of X(?<=Y) will be
On 2019-07-29 10:45, Zoltán Herczeg wrote:
I am open to other names, but I would propose the following control
verbs:
(*MOVE:mark_name)
- This verb changes the current string position to the position
recorded by the last mark which name is mark_name.
(*SETEND:mark_name)
- This verb
Good day!
pcre2test output:
/b(?
Why is "a" shown as text that was consulted during a successful pattern
match, but "c" is not?
On 2019-07-27 10:50, ND wrote:
\A and \G may be considered as (?
It seems this is not true for \G.
\G needs more investigation.
Good day!
There are some kinds of problems that exist with max_lookbehind:
- bugs
- performance issues
- brings excessive work to user
Now I report only about potential bugs.
Here is pcre2test output:
PCRE2 version 10.34-RC1 2019-04-22
/(?<=(?=(?<=.)).)/info,allusedtext
Capture group
On 2019-07-27 08:46, ph10 wrote:
For an anchored pattern, the "must be present" code unit value is set
only if it follows a variable length item in the pattern.
This is a judgement that it will probably be faster in most cases and it
will avoid the really bad case: suppose, instead of "abx"
On 2019-07-24 15:51, ph10 wrote:
If I understand you correctly, your proposal would mean that every
non-anchored pattern would give a partial, empty-string, hard partial
match at the end of a non-matching segment, and never return "no match".
Yes. And I like this idea.
With it we could
On 2019-07-23 20:20, ND wrote:
On 2019-07-22 17:32, ND wrote:
> Now it can be useful to try putting into words, what exactly in applying
> to multisegment matching means "local no match" and what means "partial
> match".
>
The docs say:
A partial match occurs during a call to pcre2_match() when
On 2019-07-22 17:32, ND wrote:
Now it can be useful to try putting into words, what exactly in applying
to multisegment matching means "local no match" and what means "partial
match".
The docs say:
A partial match occurs during a call to pcre2_match() when the end of the
subject string
On 2019-07-22 16:32, ph10 wrote:
The characteristic of these is that the pattern can match an empty
string. I have now added this condition (which was easily done with no
repeated test) and those patterns now give partial matches.
It's excellent!!
Now it can be useful to try putting into
The new algorithm still has other parts of the discussed oversight. For
example, it returns a full match instead of a partial one in the following
cases:
/(?![ab]).*/
ab\=ph
0:
/c*+/
ab\=ph,offset=2
0:
The alternative suggestion doesn't have these troubles. It simplifies the
calculations that the main application must
On 2019-07-18 16:48, ph10 wrote:
On Wed, 17 Jul 2019, ND via Pcre-dev wrote:
Let us ignore for the moment whether there should be a new option or
not, and try to figure out what new logic might be needed. I am going to
experiment with the suggestion I made earlier:
If a hard partial match
On 2019-07-17 16:55, ph10 wrote:
On Mon, 15 Jul 2019, ND via Pcre-dev wrote:
> This option is added ten years ago EXACTLY for multisegment matching.
> Please read a very first proposal post and thread about it. Thats how
> partial_hard is born:
> https://lists.exim.org/lurke
On 2019-07-17 09:00, ph10 wrote:
On Sat, 13 Jul 2019, I wrote:
>> May be "[^a]" can use the same algorithm as "[^ab]"?
> [^a] is optimized into a different (faster) opcode; I will see if this
> can easily produce the same starting code units as [^ab] for tidiness. I
> do not expect it will
On 2019-07-14 11:54, ph10 wrote:
I am sorry that I cannot help, but I don't even use Windows, let alone
MSVC. All the information I put in NON-AUTOTOOLS-BUILD was sent to me by
other people.
Thanks. Now I can successfully compile PCRE svn versions.
It was achieved not directly by MS Visual Studio
On 2019-06-05 15:54, ph10 wrote:
On Wed, 5 Jun 2019, ND via Pcre-dev wrote:
May be there is a some space for optimization there.
>> PCRE analyze subpattern in lookaround and say:
> First code unit = 'a'
> Last code unit = 'c'
>> So it already knows that "Subject length
On 2019-07-15 15:24, ph10 wrote:
My point about partial matching meaning "may be incomplete" is still
true. Partial matching was not invented originally for multi-segment
matching, but for dynamically checking input. For example, if a user is
typing an 8-digit number, as each character is
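The interactive-input use case described here can be approximated without PCRE2. A minimal sketch in Python, whose standard re module has no partial-match mode, so the "partial" answer is emulated with a hand-written prefix pattern; the helper name classify and the three result strings are hypothetical, not PCRE2 API:

```python
import re

# Hypothetical helper: classify what the user has typed so far against the
# target pattern [0-9]{8}. This only emulates the idea of a partial match;
# it does not use PCRE2 or its PCRE2_PARTIAL_* options.
FULL = re.compile(r"[0-9]{8}")
PREFIX = re.compile(r"[0-9]{0,7}")  # any proper prefix of a full match

def classify(typed: str) -> str:
    if FULL.fullmatch(typed):
        return "match"
    if PREFIX.fullmatch(typed):
        return "partial"    # more input MAY still produce a match
    return "no match"       # no continuation can ever match

for typed in ["1234", "12345678", "12a"]:
    print(repr(typed), "->", classify(typed))
```

An input checker would keep accepting keystrokes while the answer is "partial" and reject the keystroke that turns it into "no match".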
On 2019-07-13 19:21, ND wrote:
On 2019-07-13 16:50, ph10 wrote:
> On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
>> Unfortunately PCRE2 svn version is not compiled for me with Microsoft
>> Visual Studio 2019 on Windows 7x64.
> Can you compile the released source versi
On 2019-07-13 16:50, ph10 wrote:
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> Unfortunately PCRE2 svn version is not compiled for me with Microsoft
> Visual Studio 2019 on Windows 7x64.
Can you compile the released source versions? (There shouldn't be any
difference, but I just wonde
On 2019-07-13 16:47, ph10 wrote:
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:
> PCRE2_PARTIAL_HARD is intended for multisegment matching. I think when this
> option is set it means: this subject IS incomplete, it's only a non-last
> part of a certain "entire" subject.
It wa
On 2019-07-13 11:22, ph10 wrote:
I have done this work, and committed the patches. The new code supports
both (*napla: and (*naplb:
It's great! Thanks a lot!
I ran into the need for (*napla: some time ago when trying to construct a
pattern for this task:
I review the text of a research article with
On 2019-07-13 11:44, ph10 wrote:
In this case PCRE2 finds a *complete* match before it finds a partial
match. The pattern says "assert we are at the end of the subject"; that
is true. Then it says "end of pattern" - so it returns a complete match.
It never gets the chance to consider a partial
Good day!
PCRE tries to detect starting code units in an attempt to apply a start
optimization.
As we can see from the next two examples, it detects starting code units
for "[^ab]", but does not do this for "[^a]". I think this looks a bit
curious.
Maybe "[^a]" can use the same algorithm as
On 2019-07-12 15:31, ND wrote:
On 2019-07-12 15:17, ph10 wrote:
> On Fri, 12 Jul 2019, ND via Pcre-dev wrote:
>> This is about my second example.
> > But it seems first example have another issue:
> >> >PCRE2 version 10.33 2019-04-16
> > >/(?<=(?=.(?&l
On 2019-07-12 15:17, ph10 wrote:
On Fri, 12 Jul 2019, ND via Pcre-dev wrote:
This is about my second example.
> But it seems first example have another issue:
>> >PCRE2 version 10.33 2019-04-16
> >/(?<=(?=.(?<=x)))/
> >ab\=ph
> >Partial match: b
>>
On 2019-07-12 07:08, ph10 wrote:
On Thu, 11 Jul 2019, ND via Pcre-dev wrote:
> I guess you told about second example (in first example "x" don't adds). I
> believed empty match at the end of string is not counted as partial.
This is a documentation issue. Instead of "empty
On 2019-07-11 16:18, ph10 wrote:
Why? "Partial match" means "if you add some more characters to the
subject, it MAY match". If you add "x", it matches.
I guess you told about second example (in first example "x" don't adds). I
believed empty match at the end of string is not counted as
Good day!
Here are 2 pcre2test listings:
PCRE2 version 10.33 2019-04-16
/(?<=(?=.(?<=x)))/
ab\=ph
Partial match: b
/(?<=.(?=x))/
ab\=ph
Partial match: b
<
Shouldn't both results be "no match" instead of "partial match"?
Thanks.
On 2019-07-09 13:53, ph10 wrote:
On Mon, 8 Jul 2019, ND via Pcre-dev wrote:
> And if we disregards Perl's bugs then it seems (*COMMIT) in Perl works in a
> following manner:
> 1. Backtracking can't move to the left of COMMIT (this is PCRE behaviour
> too)
> 2. If COMMIT occurs the
And if we disregard Perl's bugs, then it seems (*COMMIT) in Perl works in
the following manner:
1. Backtracking can't move to the left of COMMIT (this is PCRE behaviour
too).
2. If COMMIT occurs, then the match cannot advance to any other position of
the subject. No matter there are any other
Good day!
PCRE2 version 10.33 2019-04-16
/(?0)/
abc
Failed: error -52: nested recursion at the same subject position
As far as I can see, the interpreter recognizes this endless recursion
right away.
But JIT doesn't. It recurses until memory runs out:
Failed: error -46: JIT stack limit reached
May
On 2019-06-22 16:03, ph10 wrote:
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Sorry for my bad English.
> I need to find word that is closest to the end of text and occurs at least
> 10 times in that text.
Yes, I understand that now. I will think about it.
Non-atomic lookaroun
On 2019-07-03 17:33, ph10 wrote:
On Tue, 2 Jul 2019, ND via Pcre-dev wrote:
> It seems a Perl is so buggy or have really different conception of
> (*COMMIT) then PCRE.
I am waiting for further information from the Perl developers, but I
suspect that I won't want to change PCRE2, except perh
On 2019-07-02 14:34, ph10 wrote:
A Perl developer has admitted there is some ambiguity, but suggests that
(*COMMIT) just means "never advance the starting point". That pattern can
find a match without advancing the starting point. I have pointed out
that, in that case, /.*(*COMMIT)c/ should
On 2019-07-01 10:28, ph10 wrote:
On Sun, 30 Jun 2019, ND via Pcre-dev wrote:
> PCRE2 version 10.33 2019-04-16
> /\A(?:.|..)(*THEN)c/
> abc
> No match
>
> Perl is match "abc".
> I suppose "next innermost alternative" is interpreted differently by
> PCRE
On 2019-07-01 10:28, ph10 wrote:
I think this is a bug in Perl and I will report it as such.
It's great.
As you participate in Perl regex development can you take a look at
another Perl bug please:
PCRE2 version 10.33 2019-04-16
/\A(?:.(*COMMIT))*c/
abcd
No match
But Perl reports
Good day!
Here is pcre2test listing:
PCRE2 version 10.33 2019-04-16
/\A(?:.|..)(*THEN)c/
abc
No match
Perl matches "abc".
I suppose "next innermost alternative" is interpreted differently by PCRE
and Perl.
If so, maybe PCRE should go the Perl way in this matter?
Thanks.
On 2019-06-25 09:30, Zoltán Herczeg wrote:
> It seems JIT is 16 times!! slower than interpreter for such simple
> pattern.
I did some improvements on the SSE2 accelerated search and /(?s).*/
search. You can try them now. However I have never seen such big
differences in my measurements. The
On 2019-06-23 04:33, ND wrote:
Or this calculations occurs at compile time while partial matching flag
is set at matchtime?
Oh! Now I have read the docs about it.
It seems that PARTIAL is a compile-time option only for JIT. So it seems
that disabling these calculations may matter to JIT only. May
Or do these calculations occur at compile time, while the partial matching
flag is set at match time?
Good day!
Here is pcre2test listing:
/(?<=ab)cde/info
Capture group count = 0
Max lookbehind = 2
First code unit = 'c'
Last code unit = 'e'
Subject length lower bound = 3
ab\=ph
Partial match: ab
<<
We can see that PCRE calculates first code unit, last code unit and
subject
On 2019-06-22 15:20, ph10 wrote:
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:
> Your example is not working right (let's change 10 to 3 for simplicity):
>
> /\A.*\b(\w++)(?>.*?\b\1\b){2}/
> word1 word1 word2 word2 word2 word1
> 0: word1 word1 word2 word2 word2
> 1: word2
>
Here is a second try with another example:
PCRE2 version 10.33 2019-04-16
/(?<=(?<=a)b)c.*/info
Capture group count = 0
Max lookbehind = 1
First code unit = 'c'
Subject length lower bound = 1
abc\=ph
Partial match: bc
<
Why is max lookbehind 1 and not 2?
Updated docs:
If (*SKIP) is used inside a lookbehind to specify a new starting
position...
I suggest removing "inside a lookbehind".
A new starting position that is not later than the starting point of the
current match may occur without a lookbehind:
PCRE2 version 10.33 2019-04-16
On 2019-06-22 08:56, ph10 wrote:
On Fri, 21 Jun 2019, ND via Pcre-dev wrote:
> Imagine that we have a text. There are some words in this text that occurs
> at least 10 times. We want to find from they a word that is most closer to
> the end of text.
>
> If lookahead asse
On 2019-06-22 08:51, ph10 wrote:
There must be plenty of examples where removing \z changes what is
matched. How about /[ab]*\z/ matched against "aaaxxxbbb"?
I believed it was obvious that we were talking about matching from one
position of the subject. Sorry that I didn't say it explicitly.
In your
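Philip's counterexample is easy to reproduce when the engine is free to choose the starting position. A minimal sketch in Python's re module rather than PCRE2 (an assumption made for convenience; Python spells the absolute end-of-subject anchor \Z):

```python
import re

subject = "aaaxxxbbb"

# With the end anchor, the first successful attempt starts at offset 6.
anchored = re.search(r"[ab]*\Z", subject)

# Without it, the very first attempt at offset 0 already succeeds.
unanchored = re.search(r"[ab]*", subject)

print(anchored.group(0), anchored.start())
print(unanchored.group(0), unanchored.start())
```

The two results differ because the anchored pattern fails at offsets 0 through 5 and the search bumps along, which is exactly the case set aside by restricting the discussion to matching from one position.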
Thanks a lot for clarifying docs and for your patience with me.
On 2019-06-21 16:18, ph10 wrote:
On Mon, 17 Jun 2019, ND via Pcre-dev wrote:
Second of my little concern is that "X*\z" and "X*" both matches and
matches are different.
I understand why it is from proc
On 2019-06-20 15:40, ph10 wrote:
(?:(?=X)|(?=Y))Z means "if X matches, try to match Z; if that fails, if
Y matches try to match Z". In the simple case the second match of Z will
be the same as the first, so will always fail. However, if X and Y are
complex and contain capturing parentheses, I
On 2019-06-20 16:29, ph10 wrote:
I have updated the documentation.
Updated docs:
If (*SKIP) is used inside a lookbehind to specify a new starting point that
is not later than the starting point of the current match, it is ignored,
and the normal "bumpalong" occurs.
May be "it is
On 2019-06-20 16:15, ph10 wrote:
You can see all this by making use of the "auto-callout" feature
Thanks a lot, Philip. I understand quite well what really happened.
My concern is about how this is documented.
In the first example, the same thing happens, but after (?=b) is matched,
\z
On 2019-06-20 15:53, ph10 wrote:
I have updated the doc to use your example, but it can be done easily
with other PCRE2 facilities:
(?|(ab)c|(a))
does the same thing. If "a" is complex, and you do not want to write it
out twice, you could DEFINE it and use a subroutine call.
I don't say
On 2019-06-19 20:00, Zoltán Herczeg wrote:
Assertions are like "if" statements in structured languages. A condition
part of an "if" is never retried.
(?=x|y) looks much more ergonomic than (?:(?=x)|(?=y))
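In the simple case, with no captures inside the assertions, the two spellings accept exactly the same positions. A minimal sketch in Python's re module (not PCRE2; the trailing \w and the subjects are made up for illustration):

```python
import re

single = re.compile(r"(?=x|y)\w")          # one assertion, two branches
split = re.compile(r"(?:(?=x)|(?=y))\w")   # two assertions in a group

def spans(pattern, subject):
    """Return the (start, end) span of every match in subject."""
    return [m.span() for m in pattern.finditer(subject)]

for subject in ["x1", "y1", "z1", "axby"]:
    print(subject, spans(single, subject), spans(split, subject))
```

The difference discussed in this thread only shows up when the assertions are complex and set captures, because the split form can retry what follows with the second assertion's captures in place.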
Good day!
In the ASSERTIONS chapter I can't find a statement that assertions are
atomic. This information appears only much later, in the backtracking
control verbs part.
IMHO it is important to put this info in the ASSERTIONS chapter.
But why are assertions atomic? I guess the answer is:
On 2019-06-19 17:15, ph10 wrote:
At present, lookarounds do not take part in minimum length calculations,
I see that lookarounds do take part: first and last code units are searched
for in lookarounds too.
So this is another argument against my proposal.
So I suggest closing this thread.
(*ACCEPT) can't leave lookaround borders. So ACCEPTs that are inside
lookarounds can't influence the minimum length calculation, if the contents
of lookarounds do not take part in this calculation (is this true?).
Thus it may be preferable to turn off the minimum length scan not for all
patterns
On 2019-06-17 15:44, ph10 wrote:
Why do you expect 4? The matcher goes back 2, then matches two
characters, so it is back at the start. Then it goes back 6.
You are right, Philip. My fault. I'm sorry.
Close thread.
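The arithmetic in Philip's explanation (back 2, forward 2, then back 6 from the original position) can be observed directly. A minimal sketch in Python's re module, which is a different engine from PCRE2 but accepts the same fixed-width nested lookbehind; the subject string is made up:

```python
import re

# Same shape as the reported pattern: an outer lookbehind of width 2 whose
# body ends with an inner lookbehind of width 6.
pattern = re.compile(r"(?<=.{2}(?<=.{6}))")

subject = "abcdefghij"
first = pattern.search(subject)

# Outer assertion: step back 2, match 2 characters, which returns to the
# current position. Inner assertion: step back 6 from there.
print(first.start())
```

The first position that can match is the first one with six characters to its left, which agrees with a maximum lookbehind of 6 rather than 4.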
It seems you don't understand or I don't. Sorry for my bad English.
I am not asking to calculate the real subject_length_lower_bound in
patterns with ACCEPT.
I am asking to set subject_length_lower_bound to 0 in all such patterns.
On 2019-06-17 15:07, ph10 wrote:
If a pattern contains (*ACCEPT) the
Hello!
Here is pcre2test listing:
PCRE2 version 10.33 2019-04-16
/(?<=.{2}(?<=.{6}))/info
Capture group count = 0
Max lookbehind = 6
May match empty string
Subject length lower bound = 0
abc\=ph
No match
Expected maxlookbehind=4, not 6.
Maybe the calculation algorithm could be corrected.
Hello!
The chapter ISSUES WITH MULTI-SEGMENT MATCHING of pcre2partial.html includes
item 2, which describes how to proceed with lookbehind assertions.
I think it's important to add to this algorithm a few words about "no
match":
If the result of a partial match is "no match" then last
Hello!
In pcre2test docs in chapter RESTARTING AFTER A PARTIAL MATCH there is
example:
data> 23ja\=P,dfa
What is the matching option "P"? Maybe it should be corrected to "ph" or
"ps"?
Thanks.
On 2019-06-04 16:59, ND wrote:
1. The start optimizer breaks a result, turning "match" into "no match". Is
this documented (I remember only the example with (*COMMIT) where the
optimizer can make "match" from "no match")? Maybe there is a way to
correct this PCRE optimization so that it does not break the result.
I
Good day!
Here is pcre2test listing:
PCRE2 version 10.33 2019-04-16
/(?:a|ab){1}+c/
abc
0: abc
No match expected, but pattern matched.
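The expectation of "no match" follows from reading X{1}+ as the atomic group (?>X), which is how possessive quantifiers are normally defined. A minimal sketch in Python's re module (not PCRE2), emulating the atomic group with the portable lookahead-plus-backreference idiom so that it does not depend on newer syntax support:

```python
import re

# Plain group: the engine may backtrack from "a" to "ab", so "abc" matches.
plain = re.search(r"(?:a|ab)c", "abc")

# Atomic emulation: the lookahead captures "a" and is never re-entered,
# \1 then consumes that "a", and "c" fails against "b".
atomic = re.search(r"(?=(a|ab))\1c", "abc")

print(plain.group(0) if plain else None)
print(atomic.group(0) if atomic else None)
```

With atomic semantics the first alternative "a" is locked in, so the result "0: abc" in the listing corresponds to the plain, backtracking group.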
Thanks.
Good day!
I can't find in the docs the behaviour of SKIP when the corresponding
position is before or equal to start_offset.
It seems that in this case the "bumpalong" advance is 1, not the SKIP or
associated MARK position.
/(?<=a(*SKIP)x)|c/
abcd\=offset=2
No match
/(*SKIP)x|c/
abcd
No match
Good day!
The docs say:
It is possible to construct infinite loops by following a group that can
match no characters with a quantifier that has no upper limit, for example:
(a?)*
Earlier versions of Perl and PCRE1 used to give an error at compile time
for such patterns. However, because
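The hazard in the quoted passage is that a? can match the empty string, so an unbounded repeat of the group could iterate forever without consuming input. As a small illustration in Python's re module (not PCRE2; it has its own guard against empty iterations), the repeat simply stops once the group no longer consumes anything:

```python
import re

# (a?)* repeats a group that can match the empty string. The call returns
# instead of looping once the group stops consuming characters at "b".
m = re.match(r"(a?)*", "aab")

print(repr(m.group(0)))
```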
On 2019-06-10 16:47, ph10 wrote:
I have done this, and committed the result. However, it seems to me that
/a(*ACCEPT)??bc/ is the same as a(?:bc|) though if a, b, and c are
complex it may be easier to read.
The following example was included in the docs (pcre2pattern.html):
A(*ACCEPT)??BC
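Philip's rewrite a(?:bc|) can be tried in any backtracking engine. A minimal sketch in Python's re module, which has no (*ACCEPT) verb, so only the rewritten form is shown; the subjects are "abc" and the "axy" subject used in the Perl listing from this thread:

```python
import re

# "bc" is tried first; the empty alternative is the fallback, which is the
# role played by the lazily quantified (*ACCEPT) in the original pattern.
optional_tail = re.compile(r"a(?:bc|)")

for subject in ["abc", "axy"]:
    m = optional_tail.match(subject)
    print(subject, "->", m.group(0))
```

When "bc" follows, the longer match wins; otherwise the match ends right after "a".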
Good day!
pcre2test, unlike Perl, doesn't report a MARK value that is inside a
successful condition of a conditional group.
PCRE2 version 10.33 2019-04-16
/a(?(?=(*:1)b).)/mark
ab
0: ab
Maybe this incompatibility should be fixed.
Thank you.
On 2019-06-05 16:53, ph10 wrote:
Perl gets it wrong:
/(a(?:(*ACCEPT))??bc)/
axy
No match
/a(*ACCEPT)??bc/
axy
No match
It seems to be a bug in Perl's start optimizer. It says:
"Did not find floating substr "bc"...
Match rejected by optimizer"
Please look at PCRE start optimizer. It seems correction
On 2019-06-05 08:16, ph10 wrote:
Because PCRE2 isn't clever enough to deal with lookarounds when computing
the minimum length.
Maybe there is some room for optimization there.
PCRE analyzes the subpattern in the lookaround and says:
First code unit = 'a'
Last code unit = 'c'
So it already knows
Repetition is allowed for groups such as (?:...) but not for individual
backtracking verbs (for which it is meaningless).
It seems Perl does not raise an error with "(*ACCEPT)??" and generates the
expected code.
Is there a weighty reason not to be compatible with Perl in this situation?
It's
Good day!
pcre2test:
PCRE2 version 10.33 2019-04-16
/(?=abc)/I
Capture group count = 0
May match empty string
First code unit = 'a'
Last code unit = 'c'
Subject length lower bound = 0
Why Subject length lower bound = 0, not 3?
Good day!
Here is pcre2test listing:
PCRE2 version 10.33 2019-04-16
/A(?:(*ACCEPT))?B/
A
No match
/A(?:(*ACCEPT))?B/no_start_optimize
A
0: A
/A(*ACCEPT)?B/
Failed: error 109 at offset 10: quantifier does not follow a repeatable item
A
I have two questions about it:
1. Start
On 2019-05-29 16:52, ph10 wrote:
On Wed, 29 May 2019, ND via Pcre-dev wrote:
> Since anybody put MARK verb at the beginning of pattern then it is assumed
> that this verb is definitely needed in pattern logic.
But maybe only for successful matches?
> So is there any reason to ap
If somebody puts a MARK verb at the beginning of a pattern, it can be
assumed that this verb is definitely needed in the pattern logic.
So is there any reason to apply to such patterns optimizations that can
discard that MARK?
Maybe automatically disabling such optimizations is reasonable.
Good day!
pcre2api.html document:
There are also other start-up optimizations. For example, a minimum length
for the subject may be recorded. Consider the pattern
(*MARK:A)(X|Y)
The minimum length for a match is one character. If the subject is "ABC",
there will be attempts to match
Zoltán Herczeg wrote in his message of Mon, 27 May 2019 11:06:39 +0300:
Optimizing the possessive dot is a good idea. I will do it.
I feel that the interpreter optimizes not only possessive ".*", but any
".*" in DotAll mode.
Doesn't it?
On 2018-07-22 15:09, ph10 wrote:
Consider /(?| (?<A>foo) | (?<B>bar) )/x
The table will tell you "group 1 is called A" and "group 1 is called B".
What happens if you match the pattern with "foo" and then ask "what is
the value of group B?". The table will tell you that group B is group 1,
and
On 2018-07-21 16:29, ph10 wrote:
The feature was added by creating a table that translates
a group number to a group name. This means that for each number, there
must only be one name.
Maybe a table that translates a group name to a group number would be more
useful?
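For comparison, a name-to-number table is what Python's re module exposes as the groupindex attribute of a compiled pattern. A minimal sketch (Python, not PCRE2; Python has no branch-reset groups, so each name gets its own number here, which sidesteps the duplicate-number question discussed in this thread):

```python
import re

pattern = re.compile(r"(?P<A>foo)|(?P<B>bar)")

# groupindex maps each group name to its group number.
name_to_number = dict(pattern.groupindex)
print(name_to_number)

m = pattern.search("bar")
# Looking a group up by name or by its mapped number gives the same text.
print(m.group("B"), m.group(name_to_number["B"]))
```

With such a table two names could point at the same number without conflict, whereas a number-to-name table allows only one name per number.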
Good day.
I ran into an incompatibility with Perl when trying to use a valid Perl
pattern in PCRE:
PCRE2 version 10.31 2018-02-12
/(?|(?)|(?))/
Failed: error 165 at offset 15: different names for subpatterns of the
same number are not allowed
a
Docs say:
"Warning: You cannot use different names to
And one more possible bug:
PCRE2 version 10.31 2018-02-12
/(?>a(*:1))(?>b(*:1))(*SKIP:1)x|.*/
abc
0: bc
If a MARK in an atomic group doesn't matter for SKIP, then why is the
result "bc" and not "abc"?
If a MARK in an atomic group does matter for SKIP, then why is the result
not "c"?
On 2018-07-14 15:12, ph10 wrote:
Feel free to look at the code and suggest patches. However, I don't think
it is easy.
Sorry. I'm not a C programmer.
It doesn't have to do anything
special when it passes a (*MARK:NAME) other than record a backtracking
point. Then when (*SKIP:NAME) is
On 2018-07-14 07:16, ph10 wrote:
> Why it need to backtrack?
> Why not do a "bumpalong" advance to the next starting character straight
> away?
It has to backtrack to the *MARK because that is where the bumpalong data
is remembered. There may be many *MARKs, each with a different name. You
On 2018-07-13 16:08, ph10 wrote:
When SKIP has a name, it backtracks until it hits a MARK with the same
name.
Why does it need to backtrack?
Why not do a "bumpalong" advance to the next starting character straight
away?
On 2018-07-13 07:23, ph10 wrote:
On Thu, 12 Jul 2018, ND via Pcre-dev wrote:
> And one more thing should also be clarified in docs:
> MARK name unlike MARK position is saved outside assertion or atomic group:
The MARK position *is* saved; it's just that there is never a backtr
On 2018-07-12 16:55, ph10 wrote:
There are no subroutine calls in the second, so it looks like a Perl bug.
You are right. It's a bug.
I will report it.
Can you please also report the Perl inconsistency that we discussed in "No
capture in nested negative assertions"?
On 2018-07-12 07:25, ph10 wrote:
The (*MARK) is inside the assertion. That is what matters. I have updated
the documentation to say this:
The search for a (*MARK) name uses the normal backtracking mechanism,
which means that it does not see (*MARK) settings that are inside
atomic groups or
On 2018-07-11 16:27, ph10 wrote:
This already appears in the docs:
However, when one of these verbs appears inside an atomic group or in
an assertion that is true, its effect is confined to that group,
because once the group has been matched, there is never any
backtracking into it.
I
On 2018-07-10 11:31, ph10 wrote:
Perl 5.026002 Regular Expressions
/(?!(a)b)/
a
0:
1: a
/(?!(a)b|ac)/
a
0:
/(?!ac|(a)b)/
a
0:
It seems to save the capture only if there is just one branch in the
assertion. Or maybe it has some algorithm for deciding on which branch to
try first ... I don't
On 2018-07-10 04:48, ND wrote:
On 2018-07-09 09:25, ph10 wrote:
> If any branch in a negative assertion succeeds, the captures are
> (temporarily) kept, but as the whole assertion now fails, there is an
> external backtrack, which discards the captures.
>
To what point does it backtrack?
I guess