https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94174
--- Comment #2 from Richard Henderson <rth at gcc dot gnu.org> ---
Case 3:

void test3(__int128 a, unsigned long l)
{
  if ((__int128_t)a - l <= 1)
    doit();
}

currently generates as

        subs    x0, x0, x2
        sbc     x1, x1, xzr
        cmp     x1, 0
        ble     .L11
.L7:
        ret
.L11:
        bne     .L10
        cmp     x0, 1
        bhi     .L7
.L10:
        b       doit

but at least the bne + cmp can be

        ccmp    x0, 1, #2, eq

Note that clang attempts a branchless double-word comparison

        subs    x8, x0, x2
        sbcs    x9, x1, xzr
        cmp     x8, #1
        cset    w8, hi
        cmp     x9, #0
        cset    w9, gt
        csel    w8, w8, w9, eq
        tbnz    w8, #0, .LBB0_2

we can do better than that:

        subs    x8, x0, x2
        sbcs    x9, x1, xzr
        // x9 < 0 || (x9 == 0 && x8 <= 1)
        cset    x10, lt
        ccmp    x8, #1, #2, ne   (nzCv: eq -> hi)
        ccmp    x10, #0, #4, hi  (nZcv: ls -> eq)
        b.eq    .L10

It's not 100% clear this is better than the 2 branch version
(with the ccmp), but at least it's no larger.
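
For reference, the double-word predicate in the comment above (x9 < 0 || (x9 == 0 && x8 <= 1)) can be sketched in C and compared with the direct __int128 comparison. This is only a sketch: the names ref_le1 and split_le1 are hypothetical and not from the report, it assumes a compiler with __int128 support (GCC or Clang on a 64-bit target), and it ignores inputs where a - l overflows, which is undefined in the original test anyway.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference: the comparison as written in test3 (hypothetical name). */
static bool ref_le1(__int128 a, unsigned long l)
{
    return (__int128)a - l <= 1;
}

/* Same predicate on the two 64-bit halves of the difference
   (hypothetical name). Assumes a - l does not overflow. */
static bool split_le1(__int128 a, unsigned long l)
{
    /* Unsigned subtraction wraps, like the subs/sbc pair. */
    unsigned __int128 d = (unsigned __int128)a - l;
    uint64_t lo = (uint64_t)d;                   /* plays the role of x8 */
    int64_t  hi = (int64_t)(uint64_t)(d >> 64);  /* plays the role of x9 */

    /* The result is <= 1 when the high half is negative, or when the
       high half is zero and the low half is at most 1 (unsigned). */
    return hi < 0 || (hi == 0 && lo <= 1);
}
```

The low half has to be compared as unsigned, which is why the assembly sequences use hi/ls on the low word and signed conditions only on the high word.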