https://bugs.llvm.org/show_bug.cgi?id=41429
Bug ID: 41429
Summary: vector select optimization failures?
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
From IRC:
https://godbolt.org/z/NgGuw9
#include <immintrin.h>
#include <inttypes.h>
void test_restrict(__m256i * __restrict__ dest, const __m256i * __restrict__ a)
{
(*dest)[2] = (*a)[2];
(*dest)[3] = (*a)[3];
}
Currently results in:
define dso_local void @example2(<4 x i64>* noalias nocapture, <4 x i64>* noalias nocapture readonly) local_unnamed_addr #0 {
%3 = load <4 x i64>, <4 x i64>* %1, align 32, !tbaa !2
%4 = load <4 x i64>, <4 x i64>* %0, align 32
%5 = shufflevector <4 x i64> %4, <4 x i64> %3, <4 x i32> <i32 0, i32 1, i32 6, i32 undef>
%6 = shufflevector <4 x i64> %5, <4 x i64> %3, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
store <4 x i64> %6, <4 x i64>* %0, align 32
ret void
}
Since these are simple identity shuffles, can we combine them into a single
shuffle?
I'm guessing the `undef` mask element is the problem? Something like:
; Function Attrs: norecurse nounwind uwtable
define dso_local void @test_restrict(<4 x i64>* noalias nocapture, <4 x i64>* noalias nocapture readonly) local_unnamed_addr #0 {
%3 = load <4 x i64>, <4 x i64>* %1, align 32, !tbaa !2
%4 = load <4 x i64>, <4 x i64>* %0, align 32
%5 = shufflevector <4 x i64> %4, <4 x i64> %3, <4 x i32> <i32 0, i32 1, i32 6, i32 7>
store <4 x i64> %5, <4 x i64>* %0, align 32
ret void
}
https://godbolt.org/z/NgGuw9
The final x86 asm is identical.
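For reference, here is a minimal sketch of why the two shuffles above should fold into one. It is not LLVM code: `compose_masks`, `LANES` and `UNDEF` are names made up for this illustration. It relies on the assumption, which holds in the IR above, that both shuffles use the same second operand (%3).

```c
#include <assert.h>

#define LANES 4
#define UNDEF (-1)   /* stands in for an undef mask element */

/*
 * Fold  shuffle(shuffle(a, b, inner), b, outer)  into a single mask over (a, b).
 * Mask values 0..LANES-1 select from the first operand, LANES..2*LANES-1 from
 * the second, as in LLVM's shufflevector. This is only valid because both
 * shuffles share the same second operand b.
 */
static void compose_masks(const int inner[LANES], const int outer[LANES],
                          int out[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        int m = outer[i];
        assert(m >= UNDEF && m < 2 * LANES);
        if (m == UNDEF)
            out[i] = UNDEF;      /* outer lane is undef, result lane is undef */
        else if (m < LANES)
            out[i] = inner[m];   /* lane is read from the inner shuffle's result */
        else
            out[i] = m;          /* lane is read directly from b */
    }
}
```

With inner = <0, 1, 6, undef> and outer = <0, 1, 2, 7>, this gives <0, 1, 6, 7>. The undef lane of the first shuffle is never read by the second one, so it should not by itself prevent the fold.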