https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123901
Bug ID: 123901
Summary: Consider optimizing std::regex matching algos to
convert [first,last) to a std::basic_string and match
on that
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: redi at gcc dot gnu.org
Blocks: 102445
Target Milestone: ---
We could also consider turning input ranges that aren't pointers into pointers,
by constructing a basic_string<C> from the _BiIter range, then searching in
that basic_string. This is what libc++ does, so that the matching only ever
instantiates the executor for const C* not for arbitrary iterator types.
This would reduce the number of instantiations, would always allow the optimized
_ExecutorFrame layout to be used, and would make increment and distance
operations on the iterators faster in some cases.
Even when the input uses std::string::iterator, that still requires the
compiler to optimize away the abstraction penalty of those iterator operations.
We would need a fix-up step at the end to turn the results back into the _BiIter
type expected by the std::match_results object. Depending on how many
submatches there are, that might be expensive, so maybe only do this for
non-random-access iterators.
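
A rough user-level sketch of the idea, to show where the fix-up cost comes from. This is only an illustration, not proposed library code: search_via_string is a hypothetical name, and because std::match_results<_BiIter> cannot be populated from outside the library, the sketch returns the sub-matches as a vector of iterator pairs instead. The real change would do the equivalent inside the regex algorithms.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of libstdc++): search [first,last) by copying
// it into a contiguous buffer, so the matcher only ever sees const C*.
template<typename BiIter, typename C>
bool
search_via_string(BiIter first, BiIter last, const std::basic_regex<C>& re,
                  std::vector<std::pair<BiIter, BiIter>>& subs)
{
  // Copy the input range into contiguous storage.
  const std::basic_string<C> buf(first, last);
  const C* const base = buf.data();

  // Match on pointers only.
  std::match_results<const C*> m;
  if (!std::regex_search(base, base + buf.size(), m, re))
    return false;

  // Fix-up step: translate each pointer pair back into the caller's
  // iterator type via its offset into the buffer. For non-random-access
  // iterators std::next is linear, so this is O(n) per sub-match.
  subs.clear();
  subs.reserve(m.size());
  for (std::size_t i = 0; i < m.size(); ++i)
    {
      if (!m[i].matched)
        {
          // Unmatched groups are represented as an empty range at the end.
          subs.emplace_back(last, last);
          continue;
        }
      const std::ptrdiff_t off = m[i].first - base;
      assert(off >= 0 && static_cast<std::size_t>(off) <= buf.size());
      BiIter b = std::next(first, off);
      BiIter e = std::next(b, m[i].length());
      subs.emplace_back(b, e);
    }
  return true;
}
```

With a std::list<char> as input, the only std::regex_search instantiation is the one for const char*. The loop at the end is the part that might be expensive when there are many sub-matches.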
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues