https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123262

            Bug ID: 123262
           Summary: x86 optimization: Missed combining sub and cmp on
                    subtract overflow patterns
           Product: gcc
           Version: 15.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

This is more of an optimization feature request than a bug, but I believe these
patterns for "subtract and overflow check" are common, and GCC missing these
could be annoying (at least to me).

Test code

```c
#define unlikely(x) __builtin_expect(!!(x), 0)
unsigned long func1_a(unsigned long x, unsigned long y) {
    if (unlikely(x < y))
        return 0x123;
    return x - y;
}
unsigned long func1_b(unsigned long x, unsigned long y) {
    if (unlikely(x - y > x))
        return 0x123;
    return x - y;
}
unsigned long func1_ideal(unsigned long x, unsigned long y) {
    if (__builtin_usubl_overflow(x, y, &x))
        return 0x123;
    return x;
}
void func2_a(unsigned long x, unsigned long y) {
    while (1) {
        __asm__ ("" ::: "memory");
        if (unlikely(x < y))
            break;
        x -= y;
    }
}
void func2_b(unsigned long x, unsigned long y) {
    while (1) {
        __asm__ ("" ::: "memory");
        if (unlikely(x - y > x))
            break;
        x -= y;
    }
}
void func2_ideal(unsigned long x, unsigned long y) {
    while (1) {
        __asm__ ("" ::: "memory");
        if (__builtin_usubl_overflow(x, y, &x))
            break;
    }
}
```
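The request assumes that the three forms of the check are interchangeable for unsigned operands. A minimal sketch of a sanity check for that assumption follows. The names `sub_cmp`, `sub_wrap`, `sub_builtin` and `check_variants_agree` are hypothetical and are not part of the test code or the Compiler Explorer link; they mirror `func1_a`, `func1_b` and `func1_ideal`.

```c
#include <assert.h>
#include <limits.h>

#define unlikely(x) __builtin_expect(!!(x), 0)

/* Mirrors func1_a: compare first, then subtract. */
static unsigned long sub_cmp(unsigned long x, unsigned long y) {
    if (unlikely(x < y))
        return 0x123;
    return x - y;
}

/* Mirrors func1_b: detect wraparound from the subtraction result. */
static unsigned long sub_wrap(unsigned long x, unsigned long y) {
    if (unlikely(x - y > x))
        return 0x123;
    return x - y;
}

/* Mirrors func1_ideal: GCC/Clang builtin reporting the borrow. */
static unsigned long sub_builtin(unsigned long x, unsigned long y) {
    if (__builtin_usubl_overflow(x, y, &x))
        return 0x123;
    return x;
}

/* Asserts that all three variants return the same value on a few
   boundary inputs. This is a spot check, not an exhaustive proof. */
static void check_variants_agree(void) {
    static const unsigned long v[] = {
        0, 1, 2, 0x123, ULONG_MAX - 1, ULONG_MAX
    };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            unsigned long expected = sub_cmp(v[i], v[j]);
            assert(sub_wrap(v[i], v[j]) == expected);
            assert(sub_builtin(v[i], v[j]) == expected);
        }
    }
}
```

Calling `check_variants_agree()` from a `main` function should pass silently with GCC or Clang. The `x - y > x` form works because an unsigned subtraction that borrows wraps around to a value strictly greater than `x`.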

Compiler Explorer link: https://godbolt.org/z/9TKbxna85

x86-64 gcc 15.2 with `-Os` option:

```assembly
func1_a:
        movq    %rdi, %rax
        movl    $291, %edx
        subq    %rsi, %rax
        cmpq    %rsi, %rdi
        cmovb   %rdx, %rax
        ret
func1_ideal:
        subq    %rsi, %rdi
        movl    $291, %eax
        cmovnb  %rdi, %rax
        ret
func2_a:
.L14:
        cmpq    %rsi, %rdi
        jb      .L12
        subq    %rsi, %rdi
        jmp     .L14
.L12:
        ret
func2_ideal:
.L20:
        subq    %rsi, %rdi
        jnb     .L20
        ret
```

Note that in the test code above I specifically added the "unlikely" macro as
an optimization hint, and compiled the code with the `-Os` option. I can
think of a few reasons GCC would choose not to combine the `sub` and `cmp`
instructions into one, so I suggest this optimization be performed at
`-Os` or `-Oz`, or when the (x < y) overflow conditions are marked as
"unlikely".

Clang can perform the optimization in the func1 case (but it misses it in the
func2 case, see https://github.com/llvm/llvm-project/issues/170675 ).

The func1 case may also get optimized in GCC for ARM64 (except there's a minor
issue that I've reported in bug 123009). It is not yet optimized on x86-64.
