https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124000

            Bug ID: 124000
           Summary: std::regex::extended is too strict about escaped chars
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
            Blocks: 102445
  Target Milestone: ---

We throw an exception for this regex, which libc++ accepts:

#include <regex>
int main() {
  std::regex("\\{a\\}", std::regex::extended);
}

terminate called after throwing an instance of 'std::regex_error'
  what():  Invalid escape in regular expression

POSIX 2008 said:

  An ordinary character is an ERE that matches itself. An ordinary character
  is any character in the supported character set, except for the ERE special
  characters listed in ERE Special Characters. The interpretation of an
  ordinary character preceded by an unescaped <backslash> ( '\\' ) is
  undefined, except in the context of a bracket expression (see ERE Bracket
  Expression).

So strictly speaking, we were correct to reject \\} under POSIX 2008, and the
regex should be written as "\\{a}".

POSIX 2024 says:

  An ordinary character is an ERE that matches itself. An ordinary character
  is any character in the supported character set, except for the ERE special
  characters listed in 9.4.3 ERE Special Characters. When not inside a bracket
  expression, the interpretation of an ordinary character preceded by an
  unescaped <backslash> is undefined, except for the ']' and '}' characters;
  "\]" and "\}" shall match the ']' and '}' characters, respectively.

So we should relax our implementation to accept \] and \} as equivalent to ]
and } respectively.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues
