[llvm-bugs] [Bug 49961] New: Bad codegen for vbslq_u32() intrinsic

via llvm-bugs Wed, 14 Apr 2021 10:34:46 -0700

https://bugs.llvm.org/show_bug.cgi?id=49961


            Bug ID: 49961
           Summary: Bad codegen for vbslq_u32() intrinsic
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: AArch64
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected],
                    [email protected], [email protected],
                    [email protected]

Consider:

#include <arm_neon.h>

int foo(uint32x4x2_t reg, uint32x4_t mask, int index) {
  return vbslq_u32(mask, reg.val[0], reg.val[1])[index];
}

clang vs gcc: https://gcc.godbolt.org/z/YPe3TK79P

clang trunk:
foo(uint32x4x2_t, __Uint32x4_t, int):    // @foo(uint32x4x2_t, __Uint32x4_t, int)
        sub     sp, sp, #48                     // =48
        and     x8, x0, #0x3
        add     x10, sp, #32                    // =32
        str     q1, [sp, #32]
        mov     x9, sp
        add     x11, sp, #16                    // =16
        bfi     x10, x8, #2, #2
        and     v0.16b, v0.16b, v2.16b
        bfi     x9, x8, #2, #2
        bfi     x11, x8, #2, #2
        ldr     w8, [x10]
        str     q2, [sp, #16]
        ldr     w10, [x11]
        str     q0, [sp]
        ldr     w9, [x9]
        bic     w8, w8, w10
        orr     w0, w8, w9
        add     sp, sp, #48                     // =48
        ret

gcc trunk:
foo(uint32x4x2_t, __Uint32x4_t, int):
        bsl     v2.16b, v0.16b, v1.16b
        sub     sp, sp, #16
        str     q2, [sp]
        ldr     w0, [sp, w0, sxtw 2]
        add     sp, sp, 16
        ret

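From a cursory examination of what's going on, clang lowers vbslq_u32(mask, a,
b) to the vector expression "or(and(a, mask), and(b, ~mask))", which the
backend is expected to match as a single BSL instruction. In this case,
however, something in the midend decides it is better to first extract the
elements at "index" from the operand vectors, and then do the rest of the
or(and(), and()) computation in the scalar domain. The backend therefore never
sees the full vector pattern, and the result is the stack spills and scalar
bic/orr sequence shown in the clang output above.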
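
To make the two orderings concrete, here is a minimal scalar sketch in portable
C (no NEON required). It is an illustration only, not the code clang or the
midend actually runs: the helper names (bsl_u32, select_then_extract,
extract_then_select, check_paths_agree) are hypothetical, and masking the index
with LANES - 1 is an assumption that mirrors the "and x8, x0, #0x3" in the
clang output above. Both orderings compute the same value, which is why the
midend rewrite is legal even though it hides the BSL pattern from the backend.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Bitwise select on one 32-bit lane: bits of a where mask is 1, else bits of b. */
uint32_t bsl_u32(uint32_t mask, uint32_t a, uint32_t b) {
    return (a & mask) | (b & ~mask);
}

/* Ordering 1 (what gcc emits): select across the whole vector, then extract. */
uint32_t select_then_extract(const uint32_t mask[LANES], const uint32_t a[LANES],
                             const uint32_t b[LANES], int index) {
    uint32_t result[LANES];
    for (int i = 0; i < LANES; ++i)
        result[i] = bsl_u32(mask[i], a[i], b[i]);
    return result[index & (LANES - 1)]; /* assumption: clamp index as clang does */
}

/* Ordering 2 (roughly what the midend produces): extract first, select in scalar. */
uint32_t extract_then_select(const uint32_t mask[LANES], const uint32_t a[LANES],
                             const uint32_t b[LANES], int index) {
    int i = index & (LANES - 1);        /* assumption: clamp index as clang does */
    return bsl_u32(mask[i], a[i], b[i]);
}

/* Checks that both orderings agree for every lane index. */
void check_paths_agree(const uint32_t mask[LANES], const uint32_t a[LANES],
                       const uint32_t b[LANES]) {
    for (int index = 0; index < LANES; ++index)
        assert(select_then_extract(mask, a, b, index) ==
               extract_then_select(mask, a, b, index));
}
```

The second ordering does less arithmetic, but on AArch64 each variable-index
extract goes through the stack, so it ends up costing more than one vector BSL
followed by a single extract.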
