[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #8 from Richard Earnshaw --- (In reply to Joel Holdsworth from comment #7) > > Did you test it with big-endian? > > Good question. It seems to do the right thing in both cases: > https://godbolt.org/z/7rDzAm foo2(long*, __simd128_in

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-06 Thread joel at airwebreathe dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #7 from Joel Holdsworth --- > Did you test it with big-endian? Good question. It seems to do the right thing in both cases: https://godbolt.org/z/7rDzAm

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-06 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #6 from Richard Earnshaw --- (In reply to Joel Holdsworth from comment #5) > I found that if I make modified versions of the intrinsics in arm_neon.h > that are designed more along the lines of the x86_64 SSE intrinsics defined > with

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-06 Thread joel at airwebreathe dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #5 from Joel Holdsworth --- I found that if I make modified versions of the intrinsics in arm_neon.h that are designed more along the lines of the x86_64 SSE intrinsics defined with a simple pointer dereference, then gcc does the righ
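
The idea in comment #5 can be sketched roughly as below. This is not the modified arm_neon.h from the bug report: it assumes the GCC/Clang vector extensions, and every `sketch_*` name is hypothetical. It only shows what a load/store "defined with a simple pointer dereference" looks like, so that the optimizer sees ordinary memory accesses. Whether the stack round trip is then eliminated depends on the target and compiler version.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector of four int32_t (GCC/Clang vector extension). */
typedef int32_t sketch_v4si __attribute__((vector_size(16)));

/* Same vector, but allowed to alias int32_t storage and to sit at
   4-byte alignment, similar in spirit to __m128i_u in emmintrin.h. */
typedef int32_t sketch_v4si_u
    __attribute__((vector_size(16), may_alias, aligned(4)));

/* Load expressed as a plain pointer dereference. */
static inline sketch_v4si sketch_load(const int32_t *src)
{
    return *(const sketch_v4si_u *)src;
}

/* Store expressed as a plain pointer dereference. */
static inline void sketch_store(int32_t *dst, sketch_v4si v)
{
    *(sketch_v4si_u *)dst = v;
}

/* Round trip through a stack temporary. Because the accesses are
   ordinary dereferences, the compiler is free to forward the value
   and drop the temporary. */
static void sketch_roundtrip(const int32_t in[4], int32_t out[4])
{
    int32_t tmp[4];
    sketch_v4si v = sketch_load(in);
    sketch_store(tmp, v);              /* candidate redundant store */
    sketch_store(out, sketch_load(tmp)); /* candidate redundant load */
}
```

Compiling `sketch_roundtrip` with optimization enabled and inspecting the assembly is one way to check whether the temporary survives on a given target.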

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-06 Thread joel at airwebreathe dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #4 from Joel Holdsworth --- Results for clang and MSVC are similar: clang trunk: foo(__simd128_int32_t): push {r11, lr} mov r11, sp sub sp, sp, #24 bfc sp, #0, #4 mov r0, sp

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-03 Thread joel at airwebreathe dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #3 from Joel Holdsworth --- Interesting. Comparing the implementation of _mm_store_si128 to vst1q_s32: emmintrin.h extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_store_si128 (__m128i *__

[Bug target/93005] Redundant NEON loads/stores from stack are not eliminated

2020-01-02 Thread joel at airwebreathe dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 --- Comment #2 from Joel Holdsworth --- Are you saying that if the GIMPLE were defined for the intrinsics, then the optimizer would eliminate them automatically? Or is there more to it?