https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94956

--- Comment #7 from Steinar H. Gunderson <steinar+gcc at gunderson dot no> ---
To wrap this up, confirming that GCC 11 does well on my benchmark:

BM_Chain20            54529 iterations      18781 ns/iter   GCC 10, asm bsfq
BM_Chain20            44584 iterations      22509 ns/iter   GCC 10, ffsll()
BM_Chain20            49753 iterations      20216 ns/iter   GCC 11, asm bsfq
BM_Chain20            53346 iterations      18816 ns/iter   GCC 11, ffsll()
BM_Chain20            64926 iterations      15747 ns/iter   Clang 12, asm bsfq
BM_Chain20            71208 iterations      14374 ns/iter   Clang 12, ffsll()

So for GCC 11 and later, the ffsll() variant beats the asm bsfq variant
(18816 vs. 20216 ns/iter), whereas on GCC 10 it was markedly slower
(22509 vs. 18781 ns/iter).
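
For readers without the benchmark source at hand, here is a rough sketch of
what the two variants being compared might look like. This is not the code
from the benchmark; the function names are made up for illustration, and it
assumes ffsll() ends up as the compiler builtin __builtin_ffsll (used here so
the sketch does not depend on glibc feature macros). Both functions require a
non-zero argument, since bsf leaves its destination undefined for zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 0-based index of the lowest set bit, x != 0.
 * __builtin_ffsll returns 1 + index (or 0 when x == 0), hence the -1. */
int lowest_bit_ffs(uint64_t x)
{
    return __builtin_ffsll((long long)x) - 1;
}

#if defined(__x86_64__)
/* Hypothetical helper: same result via hand-written bsfq.
 * The compiler cannot see through the asm, so it cannot fold or
 * reschedule it the way it can with the builtin. */
int lowest_bit_asm(uint64_t x)
{
    uint64_t idx;
    __asm__("bsfq %1, %0" : "=r"(idx) : "rm"(x) : "cc");
    return (int)idx;
}
#else
/* Fallback so the sketch still builds on non-x86-64 targets. */
int lowest_bit_asm(uint64_t x)
{
    return lowest_bit_ffs(x);
}
#endif

/* Returns 1 when both variants agree on a few sample inputs. */
int variants_agree(void)
{
    const uint64_t samples[] = { 1, 2, 0x80, 0xF0F0, 0x8000000000000000ULL };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (lowest_bit_ffs(samples[i]) != lowest_bit_asm(samples[i]))
            return 0;
    return 1;
}
```

Comparing the output of "gcc -O2 -S" for the two functions across compiler
versions is one way to see where the generated code differs.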

Clang does even better, but I can live with that. :-)
