https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822
Bug ID: 101822
Summary: Codegen bug for popcount
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: llvm at rifkin dot dev
Target Milestone: ---
GCC cleverly optimizes the following loop into a popcount intrinsic:
uint32_t foo(uint32_t n) {
    uint32_t count = 0;
    while(n) {
        n &= n - 1;
        count++;
    }
    return count;
}
But the generated assembly is redundant (https://godbolt.org/z/nbGb13G5W):
foo(unsigned int):
        xor     eax, eax
        xor     edx, edx
        popcnt  eax, edi
        test    edi, edi
        cmove   eax, edx
        ret
Adding if(n == 0) __builtin_unreachable(); does seem to help the compiler's analysis.
It seems the compiler does not realize that both the loop and the popcnt
intrinsic are well-defined for n == 0: both yield 0, so the test/cmove guard
is unnecessary. This is closely related to another bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101821.
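
The claim that the loop and the intrinsic agree at n == 0 can be checked with a small sketch like the one below. The function names (popcount_loop, popcount_builtin, popcount_loop_nonzero, check_popcount_equivalence) are hypothetical and chosen only for illustration. The sketch assumes a GCC-compatible compiler that provides __builtin_popcount and __builtin_unreachable, and a target where uint32_t is unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* The loop from the report: each iteration clears the lowest set bit,
   so the iteration count equals the number of set bits. For n == 0 the
   loop body never runs and the result is 0. */
uint32_t popcount_loop(uint32_t n) {
    uint32_t count = 0;
    while (n) {
        n &= n - 1;
        count++;
    }
    return count;
}

/* Direct use of the builtin. It is defined for an input of 0 and
   returns 0, so no separate zero check is needed.
   Assumption: uint32_t has the same width as unsigned int. */
uint32_t popcount_builtin(uint32_t n) {
    return (uint32_t)__builtin_popcount(n);
}

/* Variant carrying the hint mentioned in the report. Calling this with
   n == 0 is undefined behaviour, so callers must guarantee n != 0. */
uint32_t popcount_loop_nonzero(uint32_t n) {
    if (n == 0) __builtin_unreachable();
    uint32_t count = 0;
    while (n) {
        n &= n - 1;
        count++;
    }
    return count;
}

/* Compare the implementations on a few sample values, including 0. */
void check_popcount_equivalence(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 3u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t n = samples[i];
        assert(popcount_loop(n) == popcount_builtin(n));
        if (n != 0) {
            /* Only valid for non-zero input, see the comment above. */
            assert(popcount_loop_nonzero(n) == popcount_builtin(n));
        }
    }
}
```

Comparing the assembly for popcount_loop and popcount_builtin at -O2 with popcnt enabled (for example -mpopcnt) should show whether the extra test/cmove pair appears only in the loop-derived version.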