https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34723

--- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that GCC 10+ autovectorizes the loop to the following, which is better
than clang's output (clang produces a lot of shuffles):
        movq    xmm0, QWORD PTR table[rip]
        pxor    xmm1, xmm1
        movdqa  xmm2, xmm0
        psadbw  xmm2, xmm1
        movq    rax, xmm2
        add     al, BYTE PTR table[rip+8]
        add     al, BYTE PTR table[rip+9]
        movsx   eax, al
        ret
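
For readers following along, here is a rough C sketch of what that sequence
computes. The source loop is not quoted in this comment, so the sketch assumes
a 10-element byte table summed into a char accumulator; the table contents and
the names sum_scalar, sum_psadbw_style and check_equivalence are hypothetical.
psadbw against a zeroed register sums the absolute differences |b - 0| of the
8 low bytes, which is just their unsigned sum; the two remaining bytes are
added in al, and movsx sign-extends the 8-bit result.

```c
#include <assert.h>

/* Hypothetical table: the real contents are not given in the comment. */
static const unsigned char table[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 200};

/* Assumed shape of the original loop: a byte sum kept in a char.
   The narrowing conversion wraps modulo 256 on GCC. */
int sum_scalar(void)
{
    signed char sum = 0;
    for (int i = 0; i < 10; i++)
        sum = (signed char)(sum + table[i]);
    return sum;
}

/* Portable emulation of the generated code, not the intrinsics themselves. */
int sum_psadbw_style(void)
{
    /* psadbw xmm2, xmm1 with xmm1 == 0: sum of the 8 low bytes. */
    unsigned lanes = 0;
    for (int i = 0; i < 8; i++)
        lanes += table[i];

    /* add al, table[8]; add al, table[9]: 8-bit wrapping adds. */
    unsigned char al = (unsigned char)lanes;
    al = (unsigned char)(al + table[8]);
    al = (unsigned char)(al + table[9]);

    /* movsx eax, al: sign-extend the low byte. */
    return (signed char)al;
}

/* Sanity check that both formulations agree on the sample table. */
void check_equivalence(void)
{
    assert(sum_scalar() == sum_psadbw_style());
}
```

Only the low byte of the psadbw result matters here, since the final value is
truncated to 8 bits before the sign extension.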
