Re: [pcre-dev] Partial match at end of subject

ph10 Wed, 24 Jul 2019 08:53:27 -0700

On Wed, 24 Jul 2019, ND via Pcre-dev wrote:

> In terms of multisegment matching this may be put as: a hard partial match
> occurs when the current segment is not the last and its content is not enough
> to determine exactly what match (or no match) the WHOLE subject would have
> from this start position.


Yes, more or less.

I have already decided that I need to rewrite pcre2partial because the 
way things work has changed a lot since it was first written.

If I understand you correctly, your proposal would mean that every
non-anchored pattern would give an empty-string hard partial match at
the end of a non-matching segment, and would never return "no match".
I do not like this idea. This is how I see it:

1. A return of "match" means "the pattern has matched in this segment."

2. A return of "no match" means "this segment definitely cannot be part 
of a match".

3. A return of "partial match" means "adding another segment may result 
in a match starting in this segment", where "starting" means the point
from which characters are inspected.
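
The three return values above can be illustrated with a small toy model.
This is only a sketch for an unanchored, non-empty literal pattern; it is
not how PCRE2 is implemented, and the names Result and classify_segment
are hypothetical, invented for this example.

```python
from enum import Enum


class Result(Enum):
    MATCH = "match"
    NO_MATCH = "no match"
    PARTIAL = "partial match"


def classify_segment(needle: str, segment: str) -> Result:
    """Classify one segment for an unanchored literal pattern (toy model)."""
    if not needle:
        raise ValueError("this toy model only handles non-empty literals")

    # 1. "match": the pattern has matched in this segment.
    if needle in segment:
        return Result.MATCH

    # 3. "partial match": some start point inside the segment is still
    #    alive, i.e. the text from that point to the end of the segment
    #    is a non-empty proper prefix of the pattern.
    #    The loop deliberately stops before start == len(segment): an
    #    empty tail is a prefix of everything, and counting it would make
    #    every non-matching segment report "partial".
    for start in range(len(segment)):
        if needle.startswith(segment[start:]):
            return Result.PARTIAL

    # 2. "no match": no start point in this segment can begin a match.
    return Result.NO_MATCH
```

With the literal "abc", the segment "xxab" is classified as a partial match
(a match may start at the "a"), while "xyz" is classified as no match and
can be discarded before the next segment arrives.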

The tricky case is when the starting point is at the end of the 
segment and the pattern might match an empty string, because an empty 
string can be matched either at the end of the current segment or at the 
start of the next segment (which are, of course, the same place in the 
overall string). In this situation, we do not know whether the empty 
match will happen or whether adding more characters will produce a 
non-empty match. So in this very special case, "partial match" means 
"there is going to be a match at this point, but until some more 
characters are added, we do not know if it will be an empty string or 
something longer".

This is the /c*/ case, and all patterns that can match an empty string 
either have no character matches (trivial example: //) or use 
quantifiers with zero minima. I suspect this type of pattern is actually 
very rare in practice, especially in multi-segment matching scenarios.
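
The end-of-segment case for /c*/ can be modelled in the same toy fashion.
The function below is a sketch of the behaviour described in this message
for one start offset, not PCRE2 code; match_c_star and the more_to_come
flag are hypothetical names standing in for "hard partial matching was
requested because another segment may follow".

```python
def match_c_star(segment: str, start: int, more_to_come: bool):
    """Toy model of /c*/ tried at one start offset of a segment.

    Returns (kind, start, end) where kind is "match" or "partial".
    """
    end = start
    # Consume as many 'c' characters as the segment provides.
    while end < len(segment) and segment[end] == "c":
        end += 1

    if end == len(segment) and more_to_come:
        # The run of c's (possibly empty) reached the end of the segment.
        # There will be a match at this point, but whether it is empty or
        # longer depends on how the next segment begins.
        return ("partial", start, end)

    # Either a non-'c' character was inspected, or no more data will come:
    # the length of the match is known.
    return ("match", start, end)
```

For the segment "ab" with the start point at offset 2 (the end of the
segment), this returns a partial result while more data may follow, and an
empty match at offset 2 once the caller says the segment is the last one.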

Philip

-- 
Philip Hazel

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
