https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #1)
> BTW: If the result of foo is random, then cmove gets badly predicted.
> Considering the problems with cmove on x86 (even without bad prediction),
> the above optimization can be quite important. Clang does it.

There is no prediction involved in execution of CMOV. It is one ALU uop with
latency 1 on any recent x86, or two uops with latency 1 each on Haswell and
earlier, going back to Intel Core:
https://uops.info/html-instr/CMOVZ_R32_R32.html

If you convert a control dependency to a data dependency with CMOV you may end
up with slower code due to longer dependency chains, but this is not the case
here.
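
To illustrate the control-vs-data dependency point in general (this is not the testcase from this PR), here is a minimal sketch. All names (walk_branch, walk_select, the tables) are hypothetical, and whether a compiler emits a branch or a CMOV for either form depends on the compiler, target and flags; the source form guarantees neither.

```c
#include <assert.h>

/* Hypothetical tables: every entry is a valid index into the others. */
enum { N = 4 };
static const unsigned left[N]          = { 1, 2, 3, 0 };
static const unsigned right[N]         = { 2, 3, 0, 1 };
static const unsigned char go_right[N] = { 1, 0, 1, 0 };

/* Control dependency: the next index is chosen by a branch. If the branch
   is predicted correctly, the next load can issue before the flag load
   and the comparison have completed. */
static unsigned walk_branch(unsigned i, int steps)
{
    while (steps-- > 0) {
        if (go_right[i])
            i = right[i];
        else
            i = left[i];
    }
    return i;
}

/* Data dependency: both candidates are loaded and one is selected.
   If this is lowered to CMOV (an assumption, not a guarantee), the next
   index waits on the flag load and both table loads, so the loop-carried
   chain is longer, but there is nothing to mispredict. */
static unsigned walk_select(unsigned i, int steps)
{
    while (steps-- > 0) {
        unsigned l = left[i];
        unsigned r = right[i];
        i = go_right[i] ? r : l;
    }
    return i;
}

/* Both forms compute the same result; only the dependency shape differs. */
static void check_walks_agree(void)
{
    for (unsigned start = 0; start < N; start++)
        for (int steps = 0; steps < 16; steps++)
            assert(walk_branch(start, steps) == walk_select(start, steps));
}
```

The selected value feeds the next iteration's address here, which is the situation where a CMOV can lengthen the critical path. When the select result is not on a loop-carried chain, that cost does not arise.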