https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123901

            Bug ID: 123901
           Summary: Consider optimizing std::regex matching algos to
                    convert [first,last) to a std::basic_string and match
                    on that
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
            Blocks: 102445
  Target Milestone: ---

We could also consider turning input ranges that aren't pointers into pointers,
by constructing a basic_string<C> from the _BiIter range, then searching in
that basic_string. This is what libc++ does, so that the matching only ever
instantiates the executor for const C* not for arbitrary iterator types.

This reduces the number of instantiations, would always allow the optimized
_ExecutorFrame layout to be used, and would make increment and distance
operations on the iterators faster in some cases.
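
As a rough user-level illustration of the copy-then-match idea (not the
libstdc++ internals, and not what libc++ literally does), the sketch below
copies the range into a std::basic_string and searches it through const C*
only. The names search_via_buffer and first_number are hypothetical, and the
caller-owned buffer is an assumption made so that the pointer-based results
stay valid after the call.

```cpp
#include <iterator>
#include <list>
#include <regex>
#include <string>

// Hypothetical helper, not part of any standard library: copy [first, last)
// into a contiguous buffer and run the search on const C* only, so the regex
// machinery is never instantiated for BiIter itself.
template<typename BiIter,
         typename C = typename std::iterator_traits<BiIter>::value_type>
bool search_via_buffer(BiIter first, BiIter last,
                       const std::basic_regex<C>& re,
                       std::match_results<const C*>& m,
                       std::basic_string<C>& buf)
{
    buf.assign(first, last);      // one linear copy of the input
    const C* p = buf.data();      // force the const C* overload
    return std::regex_search(p, p + buf.size(), m, re);
}

// Usage with a non-contiguous container: the results in m point into buf,
// so the matched text is copied out before buf is destroyed.
inline std::string first_number(const std::list<char>& in)
{
    static const std::regex re("[0-9]+");
    std::string buf;
    std::cmatch m;
    if (search_via_buffer(in.begin(), in.end(), re, m, buf))
        return m.str();
    return std::string();
}
```

The results here still refer to the temporary buffer, not to the original
range, which is why a fix-up step is needed before they can be handed back to
the caller.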

Even when the input uses std::string::iterator, that still requires the
compiler to optimize away the abstraction penalty of those iterator operations.

We would need a fix-up step at the end to turn the results back into the
_BiIter type expected by the std::match_results object. Depending on how many
submatches there are, that might be expensive, so maybe only do this for
non-random-access iterators.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues
