https://bugs.llvm.org/show_bug.cgi?id=41430

            Bug ID: 41430
           Summary: Why is this reload not folded away unless restrict?
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Global Analyses
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]

From IRC:
https://godbolt.org/z/oVDACU

Given:
#include <immintrin.h>
#include <inttypes.h>
void example(__m256i *dest, const __m256i *a) {
    (*dest)[2] = (*a)[2];
    (*dest)[3] = (*a)[3];
}

We produce:
define dso_local void @example(<4 x i64>* nocapture, <4 x i64>* nocapture readonly) local_unnamed_addr #0 {
  %3 = load <4 x i64>, <4 x i64>* %1, align 32, !tbaa !2
  %4 = load <4 x i64>, <4 x i64>* %0, align 32
  %5 = shufflevector <4 x i64> %4, <4 x i64> %3, <4 x i32> <i32 0, i32 1, i32 6, i32 3>
  store <4 x i64> %5, <4 x i64>* %0, align 32
  %6 = load <4 x i64>, <4 x i64>* %1, align 32, !tbaa !2
  %7 = shufflevector <4 x i64> %5, <4 x i64> %6, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
  store <4 x i64> %7, <4 x i64>* %0, align 32
  ret void
}

So we load a and dest, blend the third element (index 2) of a into dest,
store dest, reload a, blend the fourth element (index 3) of a into dest,
and finally store dest.
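
The sequence above can be modeled in plain C to make the second load
explicit. This is only a sketch: the vec4 struct and the function names
(copy_with_reload, copy_single_load, results_match) are hypothetical
stand-ins for the IR, not anything LLVM emits or provides. It compares the
emitted load/blend/store/reload sequence with a version that loads a once,
for both disjoint pointers and dest == a.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-byte-aligned <4 x i64> value. */
typedef struct { _Alignas(32) int64_t lane[4]; } vec4;

/* Mirrors the emitted IR: a is loaded again after the first store. */
void copy_with_reload(vec4 *dest, const vec4 *a) {
    vec4 va = *a;               /* %3: load a            */
    vec4 vd = *dest;            /* %4: load dest         */
    vd.lane[2] = va.lane[2];    /* %5: blend lane 2      */
    *dest = vd;                 /* first store           */
    va = *a;                    /* %6: reload a          */
    vd.lane[3] = va.lane[3];    /* %7: blend lane 3      */
    *dest = vd;                 /* second store          */
}

/* The folded form: one load of a, one store of dest. */
void copy_single_load(vec4 *dest, const vec4 *a) {
    vec4 va = *a;
    vec4 vd = *dest;
    vd.lane[2] = va.lane[2];
    vd.lane[3] = va.lane[3];
    *dest = vd;
}

/* Returns 1 if both versions leave dest in the same state. */
int results_match(int aliased) {
    vec4 src = {{10, 20, 30, 40}};
    vec4 d1 = {{1, 2, 3, 4}};
    vec4 d2 = d1;
    if (aliased) {
        /* dest and a name the same object */
        d1 = src;
        d2 = src;
        copy_with_reload(&d1, &d1);
        copy_single_load(&d2, &d2);
    } else {
        copy_with_reload(&d1, &src);
        copy_single_load(&d2, &src);
    }
    return memcmp(&d1, &d2, sizeof d1) == 0;
}

/* Checks the two cases a pair of 32-byte-aligned pointers can be in. */
void check_both_cases(void) {
    assert(results_match(0));   /* disjoint objects */
    assert(results_match(1));   /* dest == a        */
}
```

Only the fully disjoint and fully identical cases are exercised, on the
assumption that two correctly aligned 32-byte objects cannot partially
overlap.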

Why is that intermediate reload there? The alignment is specified.
I guess there could be a problem if the two pointers pointed into
overlapping memory, but wouldn't that already be UB, since a partial
overlap would mean one of the pointers is misaligned?
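
For reference, the restrict-qualified variant that the summary alludes to
might look like the sketch below. It assumes the GCC/Clang vector_size
extension as a stand-in for __m256i so that it builds without
<immintrin.h>; the v4i64 typedef and the names example_restrict and
example_restrict_demo are hypothetical. With restrict, the compiler is
allowed to assume that the store through dest does not modify *a, which
is exactly the fact it needs in order to reuse the first load.

```c
#include <assert.h>

/* Assumption: GCC/Clang vector extension standing in for __m256i. */
typedef long long v4i64 __attribute__((vector_size(32)));

/* Same body as example(), but both pointers are restrict-qualified. */
void example_restrict(v4i64 *restrict dest, const v4i64 *restrict a) {
    (*dest)[2] = (*a)[2];
    (*dest)[3] = (*a)[3];
}

/* Small usage check with two distinct objects. */
void example_restrict_demo(void) {
    v4i64 dest = {1, 2, 3, 4};
    v4i64 a = {10, 20, 30, 40};
    example_restrict(&dest, &a);
    assert(dest[0] == 1 && dest[1] == 2);    /* lanes 0 and 1 untouched */
    assert(dest[2] == 30 && dest[3] == 40);  /* lanes 2 and 3 copied    */
}
```

Calling example_restrict with dest == a would violate the restrict
contract, so the demo only uses distinct objects.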

