Re: [1003.1(2024)/Issue8 0001857]: Several problems with the new "lazy" regex quantifier.

Geoff Clare via austin-group-l at The Open Group Thu, 26 Sep 2024 05:11:56 -0700

Geoff Clare wrote, on 26 Sep 2024:
>
> However, if perl is the origin of the non-greedy modifier then that
> would point to perl as the origin of the shortest vs. least repetitions
> issue.  And that does indeed seem to be the case.  Looking at
> https://perldoc.perl.org/perlre it says:
> 
>     By default, a quantified subpattern is "greedy", that is, it will
>     match as many times as possible (given a particular starting
>     location) while still allowing the rest of the pattern to match.
>     If you want it to match the minimum number of times possible,
>     follow the quantifier with a "?".
> 
> This, of course, states incorrectly how greedy subpatterns work.  They
> don't match "as many times as possible", they give the longest
> possible match.  The code doesn't match the documentation.
> 
> There are two conventions for greedy/non-greedy that make sense:
> 
> 1. Greedy is longest, non-greedy is shortest.
> 
> 2. Greedy is as many times as possible, non-greedy as few times as possible.
> 
> Convention 1 is used for greedy everywhere (as far as I know). By
> mixing up the conventions when implementing non-greedy REs, perl has a
> design flaw that others have copied, but tre has not copied and instead
> done it right.


I retract this statement.  As per the email I just sent in another
part of the thread, perl tries alternatives in order, which means
there is no difference between the two conventions.
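
As a rough illustration of that order dependence, here is a sketch
using Python's re module, which is likewise a backtracking matcher
that tries alternatives left to right.  It is a stand-in for perl,
not perl itself, and the pattern and subject string are hypothetical,
made up for this example.  The text taken by the non-greedy
repetition follows the order of the alternatives, not a "shortest" or
"fewest repetitions" rule.

```python
import re

SUBJECT = "aabc"  # hypothetical subject string

# Same alternatives in the non-greedy repetition, listed in different order.
short_first = re.match(r"((?:a|aab)+?)(b*)c", SUBJECT)
long_first = re.match(r"((?:aab|a)+?)(b*)c", SUBJECT)

# Both patterns match the whole subject ...
print(short_first.group(0), long_first.group(0))  # aabc aabc

# ... but the repetition takes different text depending on the order.
print(short_first.groups())  # ('aa', 'b')   two repetitions, 2 characters
print(long_first.groups())   # ('aab', '')   one repetition, 3 characters
```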

POSIX implementations choose between alternatives based on which
gives the longest match (with greedy repetitions), and so for them
there is a difference between the two conventions.  It is imperative
that we do not mix up the two conventions in POSIX, and therefore we
should continue to specify the macOS/tre behaviour of shortest match
for non-greedy repetitions.
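
To make the difference concrete, the following sketch enumerates, by
brute force, every way a hypothetical RE (a|aab)+? followed by b*c
can match the whole of "aabc", and then picks a candidate under each
convention.  It does not call any POSIX regex implementation; the
pattern, the subject and the helper name candidates() are assumptions
made up for this illustration.

```python
from itertools import product

ALTERNATIVES = ("a", "aab")  # body of the hypothetical (a|aab)+? repetition
SUBJECT = "aabc"             # hypothetical subject string

def candidates(subject, max_reps=4):
    """Yield (repetition_text, reps) for every way that (a|aab)+
    followed by b*c can match the whole subject."""
    for reps in range(1, max_reps + 1):
        for combo in product(ALTERNATIVES, repeat=reps):
            text = "".join(combo)
            if not subject.startswith(text):
                continue
            rest = subject[len(text):]
            # The tail b*c must consume everything that is left.
            if rest.endswith("c") and set(rest[:-1]) <= {"b"}:
                yield text, reps

found = sorted(set(candidates(SUBJECT)))
shortest = min(found, key=lambda c: len(c[0]))  # convention 1 (non-greedy)
fewest = min(found, key=lambda c: c[1])         # convention 2 (non-greedy)

print(found)     # [('aa', 2), ('aab', 1)]
print(shortest)  # ('aa', 2)
print(fewest)    # ('aab', 1)
```

Here the shortest text for the repetition needs two iterations, while
the fewest iterations give a longer text, so the two conventions pick
different candidates for the same overall match.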

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
