https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

            Bug ID: 101822
           Summary: Codegen bug for popcount
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: llvm at rifkin dot dev
  Target Milestone: ---

GCC cleverly optimizes the following loop into a popcount intrinsic:

uint32_t foo(uint32_t n) {
    uint32_t count = 0;
    while(n) {
        n &= n - 1;
        count++;
    }
    return count;
}

But the generated assembly is highly redundant https://godbolt.org/z/nbGb13G5W:

foo(unsigned int):
        xor     eax, eax
        xor     edx, edx
        popcnt  eax, edi
        test    edi, edi
        cmove   eax, edx
        ret

if(n == 0) __builtin_unreachable(); does seem to help the compiler's analysis.

It seems here the compiler is not realizing both the loop and popcnt intrinsic
are well-defined for n == 0.

This is closely related to another bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101821.
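
To illustrate the n == 0 point, here is a minimal sketch that puts the loop from the report next to a direct call to the builtin. The names foo_builtin and check_agree are hypothetical helpers introduced only for this comparison, and the sketch assumes a compiler that provides __builtin_popcount (GCC or Clang) and a target where unsigned int is 32 bits wide. For n == 0 the loop body never executes, so the loop returns 0, and __builtin_popcount(0) is also 0, so the test/cmove pair guards a case in which both paths already agree.

```c
#include <assert.h>
#include <stdint.h>

/* The loop from the report: each iteration clears the lowest set bit. */
static uint32_t foo(uint32_t n) {
    uint32_t count = 0;
    while (n) {
        n &= n - 1;
        count++;
    }
    return count;
}

/* Hypothetical helper: the same result via the builtin directly.
 * Assumes unsigned int is 32 bits, as on the x86-64 target in the report. */
static uint32_t foo_builtin(uint32_t n) {
    return (uint32_t)__builtin_popcount(n);
}

/* Hypothetical helper: aborts if the two versions disagree for n. */
static void check_agree(uint32_t n) {
    assert(foo(n) == foo_builtin(n));
}
```

Calling check_agree(0) passes, which is the case the extra test/cmove in the generated code appears to protect against.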