https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202

            Bug ID: 102202
           Summary: Inefficient expansion of memset when range is [0,1]
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*

Take:
void g(int a, char *d)
{
  if (a < 0 || a > 1) __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

----- CUT -----
GCC compiles this on x86_64 to:
g(int, char*):
        .cfi_startproc
        testl   %edi, %edi
        je      .L1
        xorl    %eax, %eax
.L2:
        movl    %eax, %edx
        addl    $1, %eax
        movb    $0, (%rsi,%rdx)
        cmpl    %edi, %eax
        jb      .L2
.L1:
        ret

This is better than what clang/LLVM and ICC generate, but the loop is not
needed: a can only be 0 or 1, and the a == 0 case already jumps around the loop.
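For illustration, here is a minimal source-level sketch of the loop-free
expansion this would allow. The name g_expected is hypothetical and is not
GCC output; it only shows that with a in [0,1] the memset reduces to at most
one byte store guarded by the test that is already emitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical hand-written equivalent of g() under the assumption that
   a is either 0 or 1: zero bytes are written when a == 0, and exactly one
   byte is written when a == 1, so no loop is required. */
void g_expected(int a, char *d)
{
  if (a)
    d[0] = 0;
}
```

In assembly terms this would correspond to keeping the testl/je pair and
replacing the loop body with a single movb.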

Here is another example not using __builtin_unreachable:
void g(int a, char *d)
{
  __builtin_memset(d, 0, a&1);
}
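
As a sketch of what the second example could reduce to: a&1 is always 0 or 1,
whatever the value of a, so the length again has range [0,1] without any help
from __builtin_unreachable. The name g_masked_expected is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical hand-written equivalent of the a&1 variant: the mask leaves
   only the low bit, so the memset length is 0 or 1 and a single guarded
   byte store is enough. */
void g_masked_expected(int a, char *d)
{
  if (a & 1)
    d[0] = 0;
}
```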
