https://bugs.llvm.org/show_bug.cgi?id=49961
Bug ID: 49961
Summary: Bad codegen for vbslq_u32() intrinsic
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: AArch64
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected],
[email protected], [email protected],
[email protected]
Consider:
#include <arm_neon.h>
int foo(uint32x4x2_t reg, uint32x4_t mask, int index) {
  return vbslq_u32(mask, reg.val[0], reg.val[1])[index];
}
clang vs gcc: https://gcc.godbolt.org/z/YPe3TK79P
clang trunk:
foo(uint32x4x2_t, __Uint32x4_t, int): // @foo(uint32x4x2_t, __Uint32x4_t, int)
sub sp, sp, #48 // =48
and x8, x0, #0x3
add x10, sp, #32 // =32
str q1, [sp, #32]
mov x9, sp
add x11, sp, #16 // =16
bfi x10, x8, #2, #2
and v0.16b, v0.16b, v2.16b
bfi x9, x8, #2, #2
bfi x11, x8, #2, #2
ldr w8, [x10]
str q2, [sp, #16]
ldr w10, [x11]
str q0, [sp]
ldr w9, [x9]
bic w8, w8, w10
orr w0, w8, w9
add sp, sp, #48 // =48
ret
gcc trunk:
foo(uint32x4x2_t, __Uint32x4_t, int):
bsl v2.16b, v0.16b, v1.16b
sub sp, sp, #16
str q2, [sp]
ldr w0, [sp, w0, sxtw 2]
add sp, sp, 16
ret
From a cursory examination of what's going on, clang lowers vbslq_u32(mask, a,
b) to a vector "or(and(a, mask), and(b, ~mask))", which the backend is expected
to match back to a single bsl. However, in this case, something in the midend
decides it's best to first extract the elements at "index" from the vectors,
and then do the or(and(), and()) song-and-dance in the scalar domain.
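
To make the two orderings concrete, here is a rough portable C sketch of the identity involved. It is not taken from LLVM or from the godbolt link. The vec4 struct is a hypothetical stand-in for uint32x4_t so that it builds without <arm_neon.h>, all function names are made up for illustration, and the "index & 3" clamp is an assumption borrowed from the "and x8, x0, #0x3" in the clang output. It only shows that select-then-extract and extract-then-select give the same value, which is why the midend rewrite is legal even though it hides the bsl pattern from the backend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for uint32x4_t; not the real NEON type. */
typedef struct { uint32_t lane[4]; } vec4;

/* Vector form: or(and(a, mask), and(b, ~mask)) applied to every lane,
   which is the bitwise select that one bsl instruction computes. */
vec4 select_vector(vec4 mask, vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = (a.lane[i] & mask.lane[i]) | (b.lane[i] & ~mask.lane[i]);
    return r;
}

/* Select in the vector domain, then extract one lane (roughly the shape
   of the gcc output). The "& 3" clamp is an assumption of this sketch. */
uint32_t select_then_extract(vec4 mask, vec4 a, vec4 b, int index) {
    return select_vector(mask, a, b).lane[index & 3];
}

/* Extract the lanes first, then select in the scalar domain (roughly the
   shape of the clang output). */
uint32_t extract_then_select(vec4 mask, vec4 a, vec4 b, int index) {
    int i = index & 3;
    return (a.lane[i] & mask.lane[i]) | (b.lane[i] & ~mask.lane[i]);
}

/* Checks that both orderings agree for every lane of one sample input. */
void check_equivalence(void) {
    vec4 mask = {{0xFFFFFFFFu, 0x00000000u, 0xFFFF0000u, 0x0000FFFFu}};
    vec4 a    = {{0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u}};
    vec4 b    = {{0xAAAAAAAAu, 0xBBBBBBBBu, 0xCCCCCCCCu, 0xDDDDDDDDu}};
    for (int i = 0; i < 4; ++i)
        assert(select_then_extract(mask, a, b, i) ==
               extract_then_select(mask, a, b, i));
}
```

Both functions return the same value for any in-range index, so the scalarized form is correct; the cost is purely in codegen, since the backend no longer sees a vector or(and(), and()) to turn into bsl.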