https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94174

--- Comment #2 from Richard Henderson <rth at gcc dot gnu.org> ---
Case 3:

void doit(void);

void test3(__int128 a, unsigned long l)
{
  if ((__int128_t)a - l <= 1)
    doit();
}

currently generates

        subs    x0, x0, x2
        sbc     x1, x1, xzr
        cmp     x1, 0
        ble     .L11
.L7:
        ret
.L11:
        bne     .L10
        cmp     x0, 1
        bhi     .L7
.L10:
        b       doit

but at least the bne + cmp can be

        ccmp     x0, 1, #0, eq    (nzcv: ne -> ls)
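
The test itself can be stated on the two 64-bit halves of the difference, which is what the branches above implement. A minimal C sketch, assuming the __int128 extension of GCC/clang; le1_wide, le1_split and join128 are names made up for this illustration, not anything in GCC:

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference: the comparison as written in test3, on the full 128-bit value. */
static bool le1_wide(__int128 d)
{
    return d <= 1;
}

/* The same test on the two halves: high is signed, low is unsigned.
   high > 0  -> false
   high < 0  -> true
   high == 0 -> decided by the unsigned test low <= 1 */
static bool le1_split(int64_t high, uint64_t low)
{
    if (high > 0)
        return false;
    if (high != 0)
        return true;
    return low <= 1;
}

/* Hypothetical helper: build a 128-bit value from its halves.
   Relies on the wrapping unsigned -> signed conversion GCC and clang provide. */
static __int128 join128(int64_t high, uint64_t low)
{
    return (__int128)(((unsigned __int128)(uint64_t)high << 64) | low);
}
```

Any replacement sequence has to keep le1_split(high, low) equal to le1_wide(join128(high, low)) for every pair of halves.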

Note that clang attempts a branchless double-word comparison

        subs    x8, x0, x2
        sbcs    x9, x1, xzr
        cmp     x8, #1
        cset    w8, hi
        cmp     x9, #0
        cset    w9, gt
        csel    w8, w8, w9, eq
        tbnz    w8, #0, .LBB0_2

we can do better than that:

        subs    x8, x0, x2
        sbcs    x9, x1, xzr
        // x9 < 0 || (x9 == 0 && x8 <= 1)
        cset    x10, lt
        ccmp    x8, #1, #2, eq    (nzCv: ne -> hi)
        ccmp    x10, #0, #0, hi   (nzcv: ls -> ne)
        b.ne    .L10

It's not 100% clear this is better than the 2 branch
version (with the ccmp), but at least it's no larger.
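
A conditional-compare chain of this kind can be checked without hardware by modelling the flags in C. The sketch below is only a model under stated assumptions: all names (nzcv, flags_cmp, flags_ccmp, holds, le1_chain) are made up here, only the condition codes needed are handled, and the flags left by the high-word sbcs are approximated by comparing the high word against zero, which ignores signed overflow of the 128-bit subtraction. The chain it evaluates is cset on lt, then ccmp low, #1, #2, eq, then ccmp on the cset result with #0, hi, ending in a test of ne.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four AArch64 condition flags. */
struct nzcv { bool n, z, c, v; };

/* Flags produced by "cmp a, b" (a 64-bit subtract that discards the result). */
static struct nzcv flags_cmp(uint64_t a, uint64_t b)
{
    uint64_t r = a - b;
    struct nzcv f;
    f.n = (r >> 63) != 0;
    f.z = (r == 0);
    f.c = (a >= b);                          /* carry set means no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 63) != 0;  /* signed overflow */
    return f;
}

enum cond { EQ, NE, HI, LS, LT, GE };

/* Does condition c hold for flags f? */
static bool holds(struct nzcv f, enum cond c)
{
    switch (c) {
    case EQ: return f.z;
    case NE: return !f.z;
    case HI: return f.c && !f.z;
    case LS: return !(f.c && !f.z);
    case LT: return f.n != f.v;
    case GE: return f.n == f.v;
    }
    return false;
}

/* "ccmp a, b, #imm, c": if c holds, compare a with b;
   otherwise load the 4-bit immediate into NZCV (N=8, Z=4, C=2, V=1). */
static struct nzcv flags_ccmp(struct nzcv f, uint64_t a, uint64_t b,
                              unsigned imm, enum cond c)
{
    if (holds(f, c))
        return flags_cmp(a, b);
    struct nzcv g = { (imm & 8) != 0, (imm & 4) != 0,
                      (imm & 2) != 0, (imm & 1) != 0 };
    return g;
}

/* high < 0 || (high == 0 && low <= 1), evaluated as one flag chain. */
static bool le1_chain(int64_t high, uint64_t low)
{
    struct nzcv f = flags_cmp((uint64_t)high, 0); /* stand-in for the sbcs flags */
    uint64_t neg = holds(f, LT);                  /* cset neg, lt */
    f = flags_ccmp(f, low, 1, 2, EQ);             /* high == 0 ? cmp low, 1 : HI */
    f = flags_ccmp(f, neg, 0, 0, HI);             /* HI ? cmp neg, 0 : NE */
    return holds(f, NE);                          /* b.ne taken */
}
```

The same helpers can be used to try other condition and immediate choices: change the arguments of the two flags_ccmp calls and compare the result against the predicate in the comment.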
