Hello!

> As I wrote at
>
> [PATCH, libcpp]: Use asm flag outputs in search_line_sse42 main loop
>
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg113610.html
>
> I won't repeat myself with the reasons; the summary is that the current
> sse4.2 code is redundant, as it has the same performance as the sse2 one.
> This improves sse2 performance by around 10% vs. the sse4.2 code by
> using a better header.

Have you tried the new SSE4.2 implementation (the one with asm flags) with an unrolled loop?

Uros.