Re: [1003.1(2024)/Issue8 0001857]: Several problems with the new "lazy" regex quantifier.

Geoff Clare via austin-group-l at The Open Group Mon, 23 Sep 2024 08:33:14 -0700

Steffen Nurpmeso wrote, on 20 Sep 2024:
>
> Hello, for your possible interest.
> 
> Steffen Nurpmeso via austin-group-l at The Open Group wrote in
>  <20240916152314.Do4nw6pA@steffen%sdaoden.eu>:
[...]
>  |  It turns out the POSIX standard is ambiguous about this situation.
>  |  The grammar in the standard for concatenated regular expressions is a
>  |  left-associative grammar.  However, there is an example in the rationale
>  |  (not officially part of the standard...) that assumes concatenation is
>  |  right-associative.
> 
> I was about to open a clarification issue, but could not find the
> quoted rationale, yet i got my hands on Mike Haertel's email
> address and thought i ask him.
> He was so nice to answer and he says that the above arose from
> memory from something read in the past, and that he is unable to
> find the exact quote now.
> So, he is about to search some more, and says he will revise the
> above paragraph in case he cannot find it.  Since i track his
> repository either a clarification issue will be opened, or not.


I assume the example Mike refers to above is this in A.9.1:

    For example, in the ERE "(a.*b)(a.*b)", the two identical
    subexpressions would match four and six characters, respectively,
    of accbaccccb.
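
The four-and-six split can be checked quickly. The following is only a
sketch: Python's re module is a backtracking matcher, not a POSIX ERE
implementation, so it is used here as a stand-in on the assumption that
the two agree for this particular pattern and input.

```python
import re

# Stand-in check only: Python's `re` does not implement POSIX
# leftmost-longest semantics, but it gives the same split for this case.
m = re.fullmatch(r"(a.*b)(a.*b)", "accbaccccb")

# Lengths of the two identical subexpressions' matches.
lengths = (len(m.group(1)), len(m.group(2)))
print(m.group(1), m.group(2), lengths)
```

Here the first subexpression cannot take the whole string, because the
second one would then have nothing left to match.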

I think this behavior is required by the normative text (outside the
grammar), rather than merely assumed by the example as Mike says.  The
relevant text is in the definition of "matched" in 9.1:

    Consistent with the whole match being the longest of the leftmost
    matches, each subpattern, from left to right, shall match the
    longest possible string.

and it goes on to give an example:

    For example, matching the BRE "\(.*\).*" against "abcdef", the
    subexpression "(\1)" is "abcdef"
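
The rule quoted from 9.1 can be illustrated with a brute-force search.
This is a hypothetical sketch, not how any real regex library works:
the helper name posix_split is made up, the sub-patterns are written in
Python syntax rather than BRE/ERE syntax, and each sub-pattern is tested
with Python's re.fullmatch.  It picks the leftmost start, then the
longest whole match, then gives each sub-pattern from left to right the
longest string it can take.

```python
import re

def posix_split(parts, text):
    """Hypothetical helper: return the strings matched by each sub-pattern
    in `parts`, or None if the concatenation matches nowhere in `text`."""

    def splits(i, pos, end):
        # Yield every way parts[i:] can cover text[pos:end] exactly,
        # trying the longest candidate for parts[i] first.
        if i == len(parts):
            if pos == end:
                yield []
            return
        for stop in range(end, pos - 1, -1):
            if re.fullmatch(parts[i], text[pos:stop]):
                for rest in splits(i + 1, stop, end):
                    yield [text[pos:stop]] + rest

    for start in range(len(text) + 1):                # leftmost start first
        for end in range(len(text), start - 1, -1):   # longest whole match first
            for found in splits(0, start, end):
                return found                          # first hit is the preferred split
    return None

# The BRE "\(.*\).*" is modelled as the two sub-patterns ".*" and ".*".
print(posix_split([".*", ".*"], "abcdef"))
# The ERE "(a.*b)(a.*b)" from A.9.1.
print(posix_split(["a.*b", "a.*b"], "accbaccccb"))
```

Under these assumptions the first call reports "abcdef" for the first
sub-pattern and the empty string for the second, and the second call
reports "accb" and "accccb", in line with both examples in the standard.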

-- 
Geoff Clare <g.cl...@opengroup.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
