https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114341
--- Comment #3 from Kang-Che Sung <Explorer09 at gmail dot com> ---
I missed one case that is more obvious:

(1 << __builtin_ctz(y)) == (y & -y)

Multiplication is not needed in this case, so (1 << __builtin_ctz(y)) can be simplified to (y & -y).

(I can't think of a reason we would need to optimize the other way around for this special case.)
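
A minimal sketch of the identity, for anyone who wants to check it locally. The function names are made up for illustration and are not part of GCC or of this report. It assumes y is an unsigned int and nonzero: __builtin_ctz(0) is undefined, whereas (y & -y) is well defined for every unsigned y (it yields 0 for y == 0), so the two forms are only interchangeable where the left-hand side is defined.

```c
#include <assert.h>

/* Left-hand side of the identity. Caller must ensure y != 0,
   because __builtin_ctz(0) has an undefined result. */
static unsigned lowest_bit_via_ctz(unsigned y)
{
    return 1u << __builtin_ctz(y);
}

/* Right-hand side of the identity. Unsigned negation wraps modulo 2^N,
   so this isolates the lowest set bit without any undefined behavior. */
static unsigned lowest_bit_via_neg(unsigned y)
{
    return y & -y;
}

/* Hypothetical helper: returns 1 if both forms agree for every y in
   [1, limit], 0 otherwise. The y != 0 check stops the loop if y wraps. */
static int forms_agree_up_to(unsigned limit)
{
    for (unsigned y = 1; y != 0 && y <= limit; ++y) {
        if (lowest_bit_via_ctz(y) != lowest_bit_via_neg(y))
            return 0;
    }
    return 1;
}
```

For example, with y = 12 (binary 1100) both functions return 4, the value of the lowest set bit.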