https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123330
Bug ID: 123330
Summary: Optimization fails with standard branchless idiom
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tobi at gcc dot gnu.org
Target Milestone: ---
The code GCC generates for this on x86-64 is surprisingly bad:
int bla(int x)
{
    return (x != 0) * __builtin_clz(x) + (x == 0) * 64;
}
See here:
https://godbolt.org/z/v5fc4ddqr
It actually evaluates both the multiplication on the right and the addition,
where really this should just be a conditional move.
BTW, it doesn't matter that the first branch uses `__builtin_clz`, which is
undefined for a zero argument (if not building with `-mlzcnt`); the intent was
to avoid that undefined case in a clean way. E.g.
int bla(int x)
{
    return (x != 0) * (x + 1) + (x == 0) * 64;
}
shows the same behavior: https://godbolt.org/z/ETe8Wcaer
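For comparison, here is a minimal sketch of the select form I would expect the idiom to be folded into. The names `bla_mul` and `bla_select` are made up for illustration and are not part of the testcase above. I have only used the second variant here, since it avoids the `__builtin_clz` question entirely.

```c
/* Branchless idiom from the second example: exactly one of the two
   comparison results is 1, so only one term contributes to the sum. */
int bla_mul(int x)
{
    return (x != 0) * (x + 1) + (x == 0) * 64;
}

/* Hypothetical equivalent written as a plain select. For every x where
   x + 1 does not overflow, this returns the same value as bla_mul. */
int bla_select(int x)
{
    return x != 0 ? x + 1 : 64;
}
```

Both functions should return 64 for `x == 0` and `x + 1` otherwise, so ideally they would compile to the same code.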
(I selected tree-optimization as category because there is no general
optimization category, and the same behavior occurs on ARM if I interpret the
assembly correctly).