https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202
Bug ID: 102202
Summary: Inefficient expansion of memset when range is [0,1]
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*

Take:

void g(int a, char *d)
{
  if (a < 0 || a > 1)
    __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

----- CUT -----

GCC compiles this on x86_64 to:

g(int, char*):
        .cfi_startproc
        testl   %edi, %edi
        je      .L1
        xorl    %eax, %eax
.L2:
        movl    %eax, %edx
        addl    $1, %eax
        movb    $0, (%rsi,%rdx)
        cmpl    %edi, %eax
        jb      .L2
.L1:
        ret

This is better than what clang/LLVM and ICC generate, but the loop is not
needed: a will be either 0 or 1, and we already jump around the loop when a
is 0.

Here is another example that does not use __builtin_unreachable:

void g(int a, char *d)
{
  __builtin_memset(d, 0, a&1);
}
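
For illustration only, here is a minimal source-level sketch of what the loop-free expansion would amount to, assuming the length is known to be 0 or 1. The names g_reference and g_expected are hypothetical and do not appear in the report. The sketch only shows the intended behaviour (a test plus a single byte store) and is not a statement about how the expander should be changed.

```c
/* g_reference: the first testcase from the report, unchanged apart from
   the (hypothetical) name. Uses GCC/clang builtins. */
static void g_reference(int a, char *d)
{
  if (a < 0 || a > 1)
    __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

/* g_expected: hypothetical hand-written equivalent. With the length
   limited to [0,1], the memset reduces to one conditional byte store,
   so no loop is required. */
static void g_expected(int a, char *d)
{
  if (a & 1)
    d[0] = 0;
}
```

For a in [0,1], a & 1 equals a, so the same form also covers the second testcase (the one using a&1 as the length).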