https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48037

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The last part of this is SLP of sqrt.  I don't know whether it fails because we
don't do SLP for function calls or because we don't treat vector constructors
as store points.

Note clang/LLVM can do SLP on x86 (but not on aarch64) and get the vectorized
sqrt.

Here is a generic testcase where we should do the SLP:
#include <math.h>
typedef __attribute__((vector_size(16))) double __m128d;

static inline __m128d _mm_set_pd(double a, double b)
{
    return (__m128d){b,a};
}

__m128d vsqrt1 (__m128d const& x)
{
  double const* __restrict__ const y = (double const*)&x;
  double const a = sqrt(y[0]);
  double const b = sqrt(y[1]);
  return _mm_set_pd(b,a);
}

__m128d vsqrt2 (__m128d const x)
{
  double const* __restrict__ const y = (double const*)&x;
  double const a = sqrt(y[0]);
  double const b = sqrt(y[1]);
  return _mm_set_pd(b,a);
}
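
For reference, here is a rough sketch of the result SLP would ideally give for the testcase: one packed sqrt instead of two scalar calls. It is an illustration only, not GCC output. The names v2df, vsqrt_scalar and vsqrt_packed are made up for this sketch, and the packed path assumes an x86 target with SSE2 (_mm_sqrt_pd from <emmintrin.h>); on other targets it just falls back to the scalar form.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>  // x86 only: __m128d and _mm_sqrt_pd
#endif

// Hypothetical name, chosen so it does not clash with the __m128d from <emmintrin.h>.
typedef double v2df __attribute__((vector_size(16)));

// Scalar form, same shape as vsqrt2: two independent sqrt calls
// whose results feed a vector constructor.
v2df vsqrt_scalar(v2df x)
{
    double const a = std::sqrt(x[0]);
    double const b = std::sqrt(x[1]);
    v2df r = {a, b};
    return r;
}

// Hand-written packed form: what a successful SLP of the two calls
// would amount to on x86 (assumption: SSE2 is available).
v2df vsqrt_packed(v2df x)
{
#if defined(__SSE2__)
    return (v2df)_mm_sqrt_pd((__m128d)x);  // one packed sqrt over both lanes
#else
    return vsqrt_scalar(x);                // no packed path in this sketch
#endif
}
```

Both functions should return the same lanes for any input, so comparing the assembly of vsqrt_scalar against vsqrt_packed at -O2/-O3 is a quick way to see whether the SLP happened.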
