[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2021-12-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> Fixed for GCC 7 by r7-5301 (aka PR 70118).

-  *(long long *)__P = ((__v2di)__B)[0];
+  *(__m64_u *)__P = (__m64) ((__v2di)__B)[0];

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2021-12-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

Andrew Pinski  changed:

   What|Removed |Added

  Component|tree-optimization   |target
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=70118
 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Andrew Pinski  ---
Fixed for GCC 7 by r7-5301 (aka PR 70118).

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #5 from H.J. Lu  ---
(In reply to Richard Biener from comment #4)
> As the summary mentions 'volatile' I'll also point to the implementation of
> the intrinsics which have
> 
> /* Store four SPFP values.  The address must be 16-byte aligned.  */
> extern __inline void __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> _mm_store_ps (float *__P, __m128 __A)
> {
>   *(__v4sf *)__P = (__v4sf)__A;
> }
> 
> so they are not using a volatile qualified type to access *__P which means
> the stores are not considered volatile by GCC.
> 
> The arguments about strict-aliasing requirements still hold, only __m128 is
> declared as __may_alias__:
> 
> /* The Intel API is flexible enough that we must allow aliasing with other
>vector types, and their scalar components.  */
> typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
> 
> /* Internal data types for implementing the intrinsics.  */
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
> 
> so the v4sf store has regular TBAA rules applied (and the __may_alias__ on
> the by value passed __A has no effect).
> 
> -> target "bug", but I'd say an INVALID one.
> 
> HJ, I remember the "master" copy of the intrinsics documentation is
> somewhere at Intel - what does that say to the two above issues?
> 
> Thus all of this boils down to the question whether the intrinsics are
> implemented correctly (as documented).  The volatile part of it would
> mean to either pessimize all users or that we can't implement the
> intrinsics as C functions.

_mm_store_ps is documented in Intel SDM for movaps.  It doesn't say
anything about aliasing.

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #6 from Marc Glisse  ---
clang has:

static __inline__ void __DEFAULT_FN_ATTRS
_mm_storel_epi64(__m128i *__p, __m128i __a)
{
  struct __mm_storel_epi64_struct {
long long __u;
  } __attribute__((__packed__, __may_alias__));
  ((struct __mm_storel_epi64_struct*)__p)->__u = __a[0];
}

It doesn't prove anything, this is just a point of comparison.
(I haven't checked, but I suspect that they have an implicit may_alias on all
vector types)

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

Richard Biener  changed:

   What|Removed |Added

 Target||x86_64-*-*, i?86-*-*
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2015-11-23
 CC||hjl.tools at gmail dot com
  Component|c++ |target
 Ever confirmed|0   |1

--- Comment #4 from Richard Biener  ---
As the summary mentions 'volatile' I'll also point to the implementation of the
intrinsics which have

/* Store four SPFP values.  The address must be 16-byte aligned.  */
extern __inline void __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_store_ps (float *__P, __m128 __A)
{
  *(__v4sf *)__P = (__v4sf)__A;
}

so they are not using a volatile qualified type to access *__P which means
the stores are not considered volatile by GCC.

The arguments about strict-aliasing requirements still hold, only __m128 is
declared as __may_alias__:

/* The Intel API is flexible enough that we must allow aliasing with other
   vector types, and their scalar components.  */
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

/* Internal data types for implementing the intrinsics.  */
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

so the v4sf store has regular TBAA rules applied (and the __may_alias__ on
the by value passed __A has no effect).

-> target "bug", but I'd say an INVALID one.

HJ, I remember the "master" copy of the intrinsics documentation is somewhere
at Intel - what does that say to the two above issues?

Thus all of this boils down to the question whether the intrinsics are
implemented correctly (as documented).  The volatile part of it would
mean to either pessimize all users or that we can't implement the
intrinsics as C functions.

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread vvsed at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #7 from Vladimir Sedach  ---
The "store" pointer could be not only volatile, but also static or global with
same error.

[Bug target/68484] _mm_storel_epi64((__m128i *)x, m); does nothing if "x" is a "volatile" ptr

2015-11-23 Thread vvsed at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484

--- Comment #8 from Vladimir Sedach  ---
Adding "static" to "volatile" "solves" the problem:

static int * volatile x = _x;

I'm using this trick to avoid aggressive optimization when measuring the time
of execution. The compiler does not skip calculations leading to saving the
result in a loop.