https://bugs.llvm.org/show_bug.cgi?id=42123

            Bug ID: 42123
           Summary: [SelectionDAG] MergeConsecutiveStores loses
                    non-temporal flag
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected], [email protected],
                    [email protected], [email protected],
                    [email protected]

https://godbolt.org/z/zWf3xk

The IR below is derived from, but is not a direct translation of, this C++
source (the alignment gets messed up):

#include <x86intrin.h>

void memcpy256_2_128_aligned(__m256 *src, __m256 *dst) {
    auto x = _mm_load_ps((float*)src + 0);
    auto y = _mm_load_ps((float*)src + 4);
    _mm_stream_ps((float*)dst + 0, x);
    _mm_stream_ps((float*)dst + 4, y);
}

define void @memcpy256_2_128_aligned(<8 x float>* noalias nocapture readonly,
                                     <8 x float>* noalias nocapture) {
  %3 = bitcast <8 x float>* %0 to <4 x float>*
  %4 = load <4 x float>, <4 x float>* %3, align 32
  %5 = getelementptr inbounds <8 x float>, <8 x float>* %0, i64 0, i64 4
  %6 = bitcast float* %5 to <4 x float>*
  %7 = load <4 x float>, <4 x float>* %6, align 16
  %8 = bitcast <8 x float>* %1 to <4 x float>*
  store <4 x float> %4, <4 x float>* %8, align 32, !nontemporal !0
  %9 = getelementptr inbounds <8 x float>, <8 x float>* %1, i64 0, i64 4
  %10 = bitcast float* %9 to <4 x float>*
  store <4 x float> %7, <4 x float>* %10, align 16, !nontemporal !0
  ret void
}
!0 = !{i32 1}

llc -mcpu=btver2

memcpy256_2_128_aligned: # @memcpy256_2_128_aligned
  vmovaps (%rdi), %ymm0
  vmovaps %ymm0, (%rsi) <-- SHOULD BE VMOVNTPS
  retq

Several things need to be addressed:
1 - retain the nontemporal flag on merged stores
2 - don't merge stores if only some of them have the nontemporal flag
3 - only merge nontemporal stores if they are naturally aligned - unaligned
nt-stores are problematic (see [Bug #42026])
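
The three points above can be sketched as a small standalone predicate. This is
only an illustration of the proposed policy, not the MergeConsecutiveStores
code: StoreInfo, MergeDecision and decideMerge are hypothetical names, and
"naturally aligned" is assumed here to mean that the known alignment of the
first store is at least the size of the merged store.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of one candidate store (not an LLVM type).
struct StoreInfo {
  bool nonTemporal;          // store carries !nontemporal metadata
  std::uint64_t sizeBytes;   // number of bytes written
  std::uint64_t alignBytes;  // known alignment of the store address
};

enum class MergeDecision { Reject, MergeTemporal, MergeNonTemporal };

// Assumption: the stores are consecutive in memory and sorted by address, so
// the first store's alignment is the alignment of the merged store.
MergeDecision decideMerge(const std::vector<StoreInfo> &stores) {
  assert(!stores.empty() && "need at least one candidate store");

  bool anyNT = false;
  bool allNT = true;
  std::uint64_t mergedSize = 0;
  for (const StoreInfo &s : stores) {
    anyNT = anyNT || s.nonTemporal;
    allNT = allNT && s.nonTemporal;
    mergedSize += s.sizeBytes;
  }

  // Point 2: never merge a mix of temporal and nontemporal stores.
  if (anyNT && !allNT)
    return MergeDecision::Reject;

  // No nontemporal stores involved: ordinary merge, nothing to preserve.
  if (!anyNT)
    return MergeDecision::MergeTemporal;

  // Point 3: only merge nontemporal stores when the merged store would be
  // naturally aligned (assumed: alignment >= merged size).
  if (stores.front().alignBytes < mergedSize)
    return MergeDecision::Reject;

  // Point 1: the merged store keeps the nontemporal flag.
  return MergeDecision::MergeNonTemporal;
}
```

For the IR in this report (two 16-byte nontemporal stores, the first with
align 32), this sketch would allow the merge and keep the flag, which matches
the expected vmovntps on a ymm register.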
