https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005
--- Comment #8 from Richard Earnshaw ---
(In reply to Joel Holdsworth from comment #7)
> > Did you test it with big-endian?
>
> Good question. It seems to do the right thing in both cases:
> https://godbolt.org/z/7rDzAm
foo2(long*, __simd128_in
--- Comment #7 from Joel Holdsworth ---
> Did you test it with big-endian?
Good question. It seems to do the right thing in both cases:
https://godbolt.org/z/7rDzAm
--- Comment #6 from Richard Earnshaw ---
(In reply to Joel Holdsworth from comment #5)
> I found that if I make modified versions of the intrinsics in arm_neon.h
> that are designed more along the lines of the x86_64 SSE intrinsics defined
> with
--- Comment #5 from Joel Holdsworth ---
I found that if I make modified versions of the intrinsics in arm_neon.h that
are designed more along the lines of the x86_64 SSE intrinsics defined with a
simple pointer dereference, then gcc does the righ
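
For illustration, a store helper written as a plain pointer dereference could look roughly like the sketch below. This is not the modified arm_neon.h code from the comment: the names `v4si_t`, `store_v4si` and `first_lane` are made up for the example, and it assumes the GCC/Clang `vector_size` extension rather than the real NEON types, so that it builds on any target.

```c
#include <stdint.h>

/* Assumption: GCC/Clang vector extension, four 32-bit lanes in 16 bytes.
   This stands in for a NEON or SSE vector type; it is not either of them. */
typedef int32_t v4si_t __attribute__((vector_size(16)));

/* Hypothetical store helper: the whole body is one pointer dereference,
   so the compiler sees an ordinary memory store rather than a builtin call. */
static inline void store_v4si(v4si_t *dst, v4si_t value)
{
    *dst = value;
}

/* Store to a local temporary, then read lane 0 back.
   Because the store is an ordinary assignment, the optimizer is free to
   forward the value and drop the stack round trip. */
int32_t first_lane(v4si_t value)
{
    v4si_t tmp;
    store_v4si(&tmp, value);
    return tmp[0];
}
```

Whether the temporary actually disappears depends on the compiler, target and optimization level, so it is worth checking the generated assembly rather than assuming it.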
--- Comment #4 from Joel Holdsworth ---
Results for clang and MSVC are similar:
clang trunk:
foo(__simd128_int32_t):
push {r11, lr}
mov r11, sp
sub sp, sp, #24
bfc sp, #0, #4
mov r0, sp
--- Comment #3 from Joel Holdsworth ---
Interesting. Comparing the implementation of _mm_store_si128 to vst1q_s32:
emmintrin.h
extern __inline void __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_store_si128 (__m128i *__
--- Comment #2 from Joel Holdsworth ---
Are you saying that if the GIMPLE were defined for the intrinsics, then the
optimizer would eliminate them automatically? Or is there more to it?