https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565
--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Of course it first tried to do

Failed to match this instruction:
(parallel [
        (set (reg:DI 101 [ _9 ])
            (ctz:DI (reg/v:DI 98 [ x ])))
        (set (reg:DI 100)
            (ctz:DI (reg/v:DI 98 [ x ])))
    ])

so we could try to do that as just the ctz and then a register move,
and hope that move can be optimised away.  But this is more expensive if
it can *not* be optimised (higher latency).  Hrm.
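
For illustration only, here is a source-level analogy of the two shapes in C.
This is not the testcase from the bug report and not what combine operates on
(it works on RTL); the function names are hypothetical, and
__builtin_ctzll is the GCC builtin, whose result is undefined for x == 0.

```c
#include <assert.h>
#include <stdint.h>

/* Shape of the rejected parallel: both destinations get their own ctz
   of the same input.  (Hypothetical name, for illustration.)  */
static void ctz_twice(uint64_t x, uint64_t *a, uint64_t *b)
{
    *a = (uint64_t)__builtin_ctzll(x);
    *b = (uint64_t)__builtin_ctzll(x);
}

/* Shape of the proposed split: one ctz, then a copy.  The copy depends
   on the ctz result, so if the move is not optimised away the second
   value is ready later than in the form above.  (Hypothetical name.)  */
static void ctz_then_move(uint64_t x, uint64_t *a, uint64_t *b)
{
    *a = (uint64_t)__builtin_ctzll(x);
    *b = *a;
}

/* Both shapes must produce identical values for any nonzero x.  */
static void check_forms_agree(uint64_t x)
{
    uint64_t a1, b1, a2, b2;

    assert(x != 0);  /* __builtin_ctzll(0) is undefined */
    ctz_twice(x, &a1, &b1);
    ctz_then_move(x, &a2, &b2);
    assert(a1 == a2 && b1 == b2);
}
```

The two functions are equivalent in value; the only difference is the extra
dependency on the copy, which is the latency concern raised above.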