https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400

Bug ID: 91400
Summary: __builtin_cpu_supports conjunction is optimized poorly
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Clang 8 optimizes both f() and g() to the same code:

bool f()
{
    return __builtin_cpu_supports("popcnt")
        && __builtin_cpu_supports("ssse3");
}

bool g()
{
    extern unsigned int cpu_model;
    return (cpu_model & 64) && (cpu_model & 4);
}

f()/g():
        mov     eax, dword ptr [rip + cpu_model]
        and     eax, 68
        cmp     eax, 68
        sete    al
        ret

GCC generates this code only for g(). For f(), GCC generates less optimal code:

f():
        mov     edx, DWORD PTR __cpu_model[rip+12]
        mov     eax, edx
        shr     eax, 6
        and     eax, 1
        and     edx, 4
        mov     edx, 0
        cmove   eax, edx
        ret

I believe it would be great if GCC were able to generate the same code for f() too.
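
For reference, the folding that Clang performs can be sketched in plain C++. This is only an illustration, not GCC or Clang source: the helper names and the `features` parameter are hypothetical, and the bit values 64 and 4 are taken from g() above (they match the `shr eax, 6` and `and edx, 4` in GCC's output for f()). Two single-bit tests joined by && are equivalent to one mask-and-compare against the combined mask 68.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two feature bits tested in the report.
constexpr std::uint32_t kBitA = 64u;  // bit 6, as in (cpu_model & 64)
constexpr std::uint32_t kBitB = 4u;   // bit 2, as in (cpu_model & 4)

// Form as written in g(): two separate bit tests joined by &&.
inline bool both_bits_separate(std::uint32_t features) {
    return (features & kBitA) && (features & kBitB);
}

// Folded form, matching the "and eax, 68; cmp eax, 68; sete al" sequence.
inline bool both_bits_merged(std::uint32_t features) {
    const std::uint32_t mask = kBitA | kBitB;  // 68
    return (features & mask) == mask;
}

// Checks that both forms agree for every value of the low 8 bits,
// which covers all combinations of the two bits involved.
inline void check_forms_agree() {
    for (std::uint32_t v = 0; v < 256u; ++v) {
        assert(both_bits_separate(v) == both_bits_merged(v));
    }
}
```

The request in this report amounts to GCC recognizing the first form when it comes from two __builtin_cpu_supports calls, as it already does for the explicit bit tests in g().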