https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124000
Bug ID: 124000
Summary: std::regex::extended is too strict about escaped chars
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: redi at gcc dot gnu.org
Blocks: 102445
Target Milestone: ---
We throw an exception for this regex, which libc++ accepts:
#include <regex>
int main() {
  std::regex("\\{a\\}", std::regex::extended);
}
terminate called after throwing an instance of 'std::regex_error'
what(): Invalid escape in regular expression
POSIX 2008 said:
An ordinary character is an ERE that matches itself. An ordinary character is
any character in the supported character set, except for the ERE special
characters listed in ERE Special Characters. The interpretation of an ordinary
character preceded by an unescaped <backslash> ( '\\' ) is undefined, except in
the context of a bracket expression (see ERE Bracket Expression).
So, strictly speaking, we were correct to reject "\\}", and the regex should be
written as "\\{a}".
POSIX 2024 says:
An ordinary character is an ERE that matches itself. An ordinary character is
any character in the supported character set, except for the ERE special
characters listed in 9.4.3 ERE Special Characters. When not inside a bracket
expression, the interpretation of an ordinary character preceded by an
unescaped <backslash> is undefined, except for the ']' and '}' characters;
"\]" and "\}" shall match the ']' and '}' characters, respectively.
So we should relax our implementation to accept \] and \} as equivalent to ]
and } respectively.
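A minimal sketch of how the behaviour could be probed on a given standard
library. The helper names accepts_ere and ere_matches are hypothetical (they are
not part of libstdc++ or libc++); they only wrap std::regex construction with
std::regex::extended and report whether std::regex_error was thrown.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if `pattern` compiles as a POSIX ERE
// (std::regex::extended) without throwing std::regex_error.
bool accepts_ere(const std::string& pattern)
{
    try {
        std::regex re(pattern, std::regex::extended);
        (void)re;  // only construction matters here
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// Hypothetical helper: returns true if `pattern` compiles as an ERE and
// matches the whole of `text`; a rejected pattern counts as "no match".
bool ere_matches(const std::string& pattern, const std::string& text)
{
    try {
        std::regex re(pattern, std::regex::extended);
        return std::regex_match(text, re);
    } catch (const std::regex_error&) {
        return false;
    }
}
```

With these helpers, ere_matches("\\{a}", "{a}") covers the form that is already
valid under the POSIX 2008 wording, while accepts_ere("\\{a\\}") and
accepts_ere("a\\]") are the two cases that the POSIX 2024 wording makes valid.
The result of the latter two calls depends on the library and its version.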
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues