[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 --- Comment #10 from Andrew Pinski --- (In reply to Andrew Pinski from comment #9) > Fixed for GCC 7 by r7-5301 (aka PR 70118). - *(long long *)__P = ((__v2di)__B)[0]; + *(__m64_u *)__P = (__m64) ((__v2di)__B)[0];
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 Andrew Pinski changed: What|Removed |Added Component|tree-optimization |target See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=70118 Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #9 from Andrew Pinski --- Fixed for GCC 7 by r7-5301 (aka PR 70118).
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 --- Comment #5 from H.J. Lu --- (In reply to Richard Biener from comment #4) > As the summary mentions 'volatile' I'll also point to the implementation of > the intrinsics which have > > /* Store four SPFP values. The address must be 16-byte aligned. */ > extern __inline void __attribute__((__gnu_inline__, __always_inline__, > __artificial__)) > _mm_store_ps (float *__P, __m128 __A) > { > *(__v4sf *)__P = (__v4sf)__A; > } > > so they are not using a volatile qualified type to access *__P which means > the stores are not considered volatile by GCC. > > The arguments about strict-aliasing requirements still hold, only __m128 is > declared as __may_alias__: > > /* The Intel API is flexible enough that we must allow aliasing with other >vector types, and their scalar components. */ > typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__)); > > /* Internal data types for implementing the intrinsics. */ > typedef float __v4sf __attribute__ ((__vector_size__ (16))); > > so the v4sf store has regular TBAA rules applied (and the __may_alias__ on > the by value passed __A has no effect). > > -> target "bug", but I'd say an INVALID one. > > HJ, I remember the "master" copy of the intrinsics documentation is > somewhere at Intel - what does that say to the two above issues? > > Thus all of this boils down to the question whether the intrinsics are > implemented correctly (as documented). The volatile part of it would > mean to either pessimize all users or that we can't implement the > intrinsics as C functions. _mm_store_ps is documented in Intel SDM for movaps. It doesn't say anything about aliasing.
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 --- Comment #6 from Marc Glisse --- clang has: static __inline__ void __DEFAULT_FN_ATTRS _mm_storel_epi64(__m128i *__p, __m128i __a) { struct __mm_storel_epi64_struct { long long __u; } __attribute__((__packed__, __may_alias__)); ((struct __mm_storel_epi64_struct*)__p)->__u = __a[0]; } It doesn't prove anything, this is just a point of comparison. (I haven't checked, but I suspect that they have an implicit may_alias on all vector types)
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 Richard Biener changed: What|Removed |Added Target||x86_64-*-*, i?86-*-* Status|UNCONFIRMED |WAITING Last reconfirmed||2015-11-23 CC||hjl.tools at gmail dot com Component|c++ |target Ever confirmed|0 |1 --- Comment #4 from Richard Biener --- As the summary mentions 'volatile' I'll also point to the implementation of the intrinsics which have /* Store four SPFP values. The address must be 16-byte aligned. */ extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_store_ps (float *__P, __m128 __A) { *(__v4sf *)__P = (__v4sf)__A; } so they are not using a volatile qualified type to access *__P which means the stores are not considered volatile by GCC. The arguments about strict-aliasing requirements still hold, only __m128 is declared as __may_alias__: /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. */ typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__)); /* Internal data types for implementing the intrinsics. */ typedef float __v4sf __attribute__ ((__vector_size__ (16))); so the v4sf store has regular TBAA rules applied (and the __may_alias__ on the by value passed __A has no effect). -> target "bug", but I'd say an INVALID one. HJ, I remember the "master" copy of the intrinsics documentation is somewhere at Intel - what does that say to the two above issues? Thus all of this boils down to the question whether the intrinsics are implemented correctly (as documented). The volatile part of it would mean to either pessimize all users or that we can't implement the intrinsics as C functions.
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 --- Comment #7 from Vladimir Sedach --- The "store" pointer could be not only volatile, but also static or global with same error.
[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484 --- Comment #8 from Vladimir Sedach --- Adding "static" to "volatile" "solves" the problem: static int * volatile x = _x; I'm using this trick to avoid aggressive optimization when measuring the time of execution. The compiler does not skip calculations leading to saving the result in a loop.