https://bugs.llvm.org/show_bug.cgi?id=42123
Bug ID: 42123
Summary: [SelectionDAG] MergeConsecutiveStores loses non-temporal flag
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Common Code Generator Code
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected],
[email protected], [email protected],
[email protected]
https://godbolt.org/z/zWf3xk
Derived from the following (the IR is not a direct copy of the C++ source's output, since the alignment gets messed up):
#include <x86intrin.h>
void memcpy256_2_128_aligned(__m256 *src, __m256 *dst) {
  auto x = _mm_load_ps((float*)src + 0);
  auto y = _mm_load_ps((float*)src + 4);
  _mm_stream_ps((float*)dst + 0, x);
  _mm_stream_ps((float*)dst + 4, y);
}
define void @memcpy256_2_128_aligned(<8 x float>* noalias nocapture readonly,
                                     <8 x float>* noalias nocapture) {
  %3 = bitcast <8 x float>* %0 to <4 x float>*
  %4 = load <4 x float>, <4 x float>* %3, align 32
  %5 = getelementptr inbounds <8 x float>, <8 x float>* %0, i64 0, i64 4
  %6 = bitcast float* %5 to <4 x float>*
  %7 = load <4 x float>, <4 x float>* %6, align 16
  %8 = bitcast <8 x float>* %1 to <4 x float>*
  store <4 x float> %4, <4 x float>* %8, align 32, !nontemporal !0
  %9 = getelementptr inbounds <8 x float>, <8 x float>* %1, i64 0, i64 4
  %10 = bitcast float* %9 to <4 x float>*
  store <4 x float> %7, <4 x float>* %10, align 16, !nontemporal !0
  ret void
}
!0 = !{i32 1}
llc -mcpu=btver2
memcpy256_2_128_aligned: # @memcpy256_2_128_aligned
vmovaps (%rdi), %ymm0
vmovaps %ymm0, (%rsi) <-- SHOULD BE VMOVNTPS
retq
Several things need to be addressed:
1 - retain the nontemporal flag on merged stores
2 - don't merge stores if only some of them have the nontemporal flag
3 - only merge nontemporal stores if the merged store is naturally aligned -
unaligned nt-stores are problematic (see [Bug #42026])
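
The three points above can be sketched as a single decision function. This is only an illustration of the intended rules, not the actual MergeConsecutiveStores code: StoreInfo, MergeDecision and decideMerge are hypothetical names, and "naturally aligned" is assumed here to mean that the alignment of the lowest-addressed store is at least the size of the merged store.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a candidate store (not an LLVM type).
struct StoreInfo {
  std::uint64_t sizeInBytes;   // number of bytes written by this store
  std::uint64_t alignInBytes;  // known alignment of the store address
  bool nonTemporal;            // true if the store carries !nontemporal
};

enum class MergeDecision { DoNotMerge, MergeTemporal, MergeNonTemporal };

// Decide how a run of consecutive stores may be merged.
// Assumption: stores[0] is the lowest-addressed store, so its alignment
// is the alignment of the merged store.
MergeDecision decideMerge(const std::vector<StoreInfo> &stores) {
  assert(!stores.empty() && "need at least one candidate store");

  std::uint64_t mergedSize = 0;
  bool anyNonTemporal = false;
  bool allNonTemporal = true;
  for (const StoreInfo &s : stores) {
    mergedSize += s.sizeInBytes;
    anyNonTemporal = anyNonTemporal || s.nonTemporal;
    allNonTemporal = allNonTemporal && s.nonTemporal;
  }

  // Point 2: a mix of nontemporal and regular stores is never merged.
  if (anyNonTemporal && !allNonTemporal)
    return MergeDecision::DoNotMerge;

  // No nontemporal stores involved: ordinary merge.
  if (!anyNonTemporal)
    return MergeDecision::MergeTemporal;

  // Point 3: only merge nontemporal stores if the result is naturally aligned.
  if (stores.front().alignInBytes < mergedSize)
    return MergeDecision::DoNotMerge;

  // Point 1: the merged store keeps the nontemporal flag.
  return MergeDecision::MergeNonTemporal;
}
```

For the IR above (two 16-byte nontemporal stores, the first aligned to 32), this sketch returns MergeNonTemporal, which corresponds to the expected single VMOVNTPS.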