https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121912

            Bug ID: 121912
           Summary: std::search optimizations for bidirectional and random
                    access iterators.
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Given std::search(haystack, haystack_end, needle, needle_end) we always do a
linear search  through the whole haystack.

If haystack and needle are both random access iterators, we can return early if
at any time during the algo:

  (haystack_end - haystack) < (needle_end - needle)

If haystack is random access but needle is not, we can still optimize the algo
by keeping track of a lower bound estimate for distance(needle, needle_end).
Every time we find *needle in the haystack and start iterating to see if the
rest of [needle,needle_end) matches, we can count how many matches there are,
and update our lower bound estimate for distance(needle, needle_end). Then we
can return early if haystack_end - haystack is smaller than that estimate.

Even if haystack is not random access but is bidirectional, we can _still_
optimize the algo. Every time we match an element from the needle we can
decrement a copy of haystack_end and on the next iteration use that decremented
iterator as the end for the next loop's find_if(haystack, end, *needle). 

There's never any need for that find_if call to look all the way to
haystack_end, because by the time we reach the for (;;) loop we know that the
needle is at least length 2.


We can also apply these optimizations to ranges::search, using the
sized_sentinel_for concept to extend the optimizations to sized
non-random-access ranges.

Reply via email to