I searched the regex man pages for FreeBSD, DragonFly BSD, NetBSD, and OpenBSD. Only 
DragonFly BSD documents it, and it requires an additional `REG_ENHANCED` flag to be set 
in the C regex functions:

- FreeBSD: 
man.freebsd.org/cgi/man.cgi?query=regex&apropos=0&sektion=0&manpath=FreeBSD+14.1-RELEASE&arch=default&format=html
- DragonFly BSD: 
man.freebsd.org/cgi/man.cgi?query=regex&apropos=0&sektion=0&manpath=DragonFly+6.4.0&arch=default&format=html
- NetBSD: 
man.freebsd.org/cgi/man.cgi?query=regex&apropos=0&sektion=0&manpath=NetBSD+10.0&arch=default&format=html
- OpenBSD: 
man.freebsd.org/cgi/man.cgi?query=regex&apropos=0&sektion=0&manpath=OpenBSD+7.5&arch=default&format=html
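To illustrate the flag requirement, here is a minimal sketch of how a portable caller might request enhanced mode only where it exists. It assumes `REG_ENHANCED` is a preprocessor macro in `<regex.h>` on platforms that support it; the helper names `compile_ere_enhanced` and `has_reg_enhanced` are made up for this example and are not part of any library.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` as an ERE, adding REG_ENHANCED
 * only when the platform's <regex.h> defines it (assumed to be a macro).
 * Returns the regcomp() result, which is 0 on success. */
int compile_ere_enhanced(regex_t *re, const char *pattern)
{
    int cflags = REG_EXTENDED;
#ifdef REG_ENHANCED
    cflags |= REG_ENHANCED;
#endif
    return regcomp(re, pattern, cflags);
}

/* Hypothetical helper: 1 if enhanced mode was available at build time,
 * 0 otherwise, so callers know whether `?`-suffixed operators can be
 * expected to behave as minimal repetitions. */
int has_reg_enhanced(void)
{
#ifdef REG_ENHANCED
    return 1;
#else
    return 0;
#endif
}
```

On a system without `REG_ENHANCED`, the helper falls back to plain `REG_EXTENDED`, where the meaning of a pattern such as `a+?` is whatever that implementation chooses.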

Next, I'd like to make a point:

We should say "left-most **least repetition**" rather than "left-most shortest".

Saying "shortest" confuses readers of the standard. Take this regex: 
`(a.+b)+?`. If the rule is "shortest", does that imply that the plus sign 
inside the parentheses adapts to the one outside so that the overall match is 
shortest? Apparently not, and definitely not according to DragonFly:

> Minimal Repetitions (available for enhanced extended REs only)
>        By default, the repetition operators, `*', bound, `?' and `+' are
>        greedy; they try to match as many times as possible.  In enhanced mode,
>        appending a `?' to a repetition operator makes it minimal (or
>        ungreedy); it tries to match the fewest number of times (including zero
>        times, as appropriate).

The same man page then notes:

> In the current implementation, minimal repetitions have a high
> precedence, and can cause other standards requirements to be violated.  For
> instance, on the string `aaaaa', the RE `(aaa??)*' will only match the
> first four characters, violating the rules that the longest possible
> match is made and the longest subexpressions are matched.
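The quoted case can be checked on a given system with a small probe like the one below. This is only a sketch: `match_len` is a name invented for this example, and the caller is assumed to pass `REG_ENHANCED` as the extra flag on platforms that define it (0 elsewhere).

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical probe: length of the overall match of the ERE `pattern`
 * against `subject`. Returns -1 if nothing matches and -2 if the
 * pattern does not compile. `extra_cflags` is OR-ed into REG_EXTENDED,
 * e.g. REG_ENHANCED where the platform provides it. */
long match_len(const char *pattern, const char *subject, int extra_cflags)
{
    regex_t re;
    regmatch_t m[1];
    long len;

    if (regcomp(&re, pattern, REG_EXTENDED | extra_cflags) != 0)
        return -2;

    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so); /* span of the whole match */
    else
        len = -1;

    regfree(&re);
    return len;
}
```

Calling `match_len("(aaa?)*", "aaaaa", 0)` exercises the greedy form, which should cover all five characters. Calling it with `"(aaa??)*"` and `REG_ENHANCED` is the case where the DragonFly man page reports a four-character match; I have not run that myself, and without enhanced mode the result for that pattern depends on the implementation.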

The standard, however, has neither loosened this requirement nor declared the 
behavior undefined, which leaves third-party implementations non-interoperable. 
There are a few choices: 1. make the use of the "lazy" quantifier undefined; 
2. re-specify the requirement for the length of matches of subpatterns; 
3. provide multiple options for implementors. Of these, 1 and 3 appear to be 
the least preferable, and 2 the most difficult.

Overall, I think the introduction of lazy quantifiers was too hasty.

________________________________________
From: shwaresyst <[email protected]>
Sent: September 15, 2024 13:09
To: Niu Danny; Niu Danny via austin-group-l at The Open Group; Steffen Nurpmeso
Subject: Re: Re: Re: [1003.1(2024)/Issue8 0001857]: Several problems with the new 
"lazy" regex quantifier.

According to 792, the primary referenced implementations were only named on the 
mailing list; he didn't copy them to the bug. Somebody might be able to search 
the 2013 posts for them. BSD was mentioned, so it may be one of them.

On Sun, Sep 15, 2024 at 12:29 AM, Niu Danny via austin-group-l at The Open Group
<[email protected]> wrote:
I didn't mean which bug introduced that feature. I meant which 
**implementation** we based it on.

________________________________________

From: [email protected] on behalf of Steffen Nurpmeso via austin-group-l at 
The Open Group <[email protected]>
Sent: September 15, 2024 04:53
To: Niu Danny via austin-group-l at The Open Group
Subject: Re: Re: [1003.1(2024)/Issue8 0001857]: Several problems with the new "lazy" 
regex quantifier.

Niu Danny via austin-group-l at The Open Group wrote in
<tyapr01mb4992132ab4f0542d6eee5a4fc1...@tyapr01mb4992.jpnprd01.prod.outl\
ook.com>:
|The following issue has been SUBMITTED.
|======================================================================
|https://www.austingroupbugs.net/view.php?id=1857
...
|On a related note, can we disclose from which implementation(s)
|did we standardize REG_MINIMAL from? I checked, it's not GNU,
|at least not as of 2024-09-14.

This was [1], later corrected / adjusted by [2].

  [1] https://www.austingroupbugs.net/view.php?id=793
  [2] https://www.austingroupbugs.net/view.php?id=1329

I would have to reread all that.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter          he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


