https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94956
Bug ID: 94956
Summary: Unable to remove impossible ffs() test for zero
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: steinar+gcc at gunderson dot no
Target Milestone: ---
Hi,
Given GCC 10 on x86-64 and this code:
#include <stdint.h>
#include <string.h>

int foo(uint32_t x) {
    if (x == 0) __builtin_unreachable();
    return ffs(x) - 1;
}
I get this assembly:
atum17:~> gcc-10 -O2 -c test.c
atum17:~> objdump --disassemble test.o
The cmovne test is rather expensive for me due to high instruction latency, and
I can never have zero in my situation. (It costs ~10% in a much larger graph
algorithm.) I'm unable to get GCC to understand that it doesn't need it, save
for using an explicit asm statement.
By contrast, Clang 10 gets this right:
atum17:~> clang-10 -O2 -c test.c
atum17:~> objdump --disassemble test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <foo>:
   0:   0f bc c7                bsf    %edi,%eax
   3:   c3                      retq
Is it possible to get access to the raw instruction by some clever means? :-)
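One avenue that might be worth trying, sketched here rather than taken from the report: GCC's __builtin_ctz() counts trailing zero bits and is documented as undefined for a zero argument, so the compiler has no zero case to handle. For any nonzero x, ffs(x) - 1 and __builtin_ctz(x) give the same value. The name foo_ctz is hypothetical, and whether this produces a bare bit-scan instruction depends on the GCC version and target flags.

```c
#include <stdint.h>

/* Hypothetical variant of foo(); assumes the caller guarantees x != 0.
 * __builtin_ctz() has an undefined result for 0, so no fallback value
 * for the zero case is required of the compiler. */
int foo_ctz(uint32_t x) {
    return __builtin_ctz(x);
}
```

For example, foo_ctz(8) returns 3, the same as ffs(8) - 1.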