Hi!

On the following testcase, simplify_unary_operation is called with code
VEC_DUPLICATE, mode V8SFmode and operand (vec_duplicate:V4SF something:SF).
simplify_unary_operation_1 then tries an optimization usable for most
unary operations; in particular, it attempts to build
(vec_duplicate:V8SF (unary:SF something:SF))
which is reasonable for all unary ops except when unary is itself
vec_duplicate, because vec_duplicate needs a vector outer mode and a scalar
or vector inner mode, not scalar outer and inner modes.
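
To illustrate the mode constraint, here is a toy model in C.  It is only a
sketch and not GCC code: every toy_* name below is made up for this example,
and the modes and codes are reduced to a handful of enumerators.  It shows
why pushing the operation into the duplicated element is fine for something
like NEG but yields an ill-formed (vec_duplicate:SF ...) when the operation
is VEC_DUPLICATE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine modes (not GCC's machine_mode).  */
enum toy_mode { TOY_SF, TOY_V4SF, TOY_V8SF };

/* Hypothetical stand-ins for a few unary rtx codes.  */
enum toy_code { TOY_NEG, TOY_ABS, TOY_VEC_DUPLICATE };

/* True for the vector modes of this toy model.  */
bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V4SF || mode == TOY_V8SF;
}

/* Element mode of a vector mode; every toy vector mode has SF elements.  */
enum toy_mode
toy_inner_mode (enum toy_mode mode)
{
  assert (toy_vector_mode_p (mode));
  return TOY_SF;
}

/* Whether (CODE:MODE operand) has an acceptable result mode.  Only the
   rule discussed in this mail is modelled: VEC_DUPLICATE needs a vector
   result mode.  */
bool
toy_unary_result_mode_ok_p (enum toy_code code, enum toy_mode mode)
{
  if (code == TOY_VEC_DUPLICATE)
    return toy_vector_mode_p (mode);
  return true;
}

/* Models the patched condition: (CODE:MODE (vec_duplicate ELT)) may be
   rewritten as (vec_duplicate:MODE (CODE:inner ELT)) only if CODE is not
   itself VEC_DUPLICATE.  */
bool
toy_can_push_into_duplicate_p (enum toy_code code, enum toy_mode mode,
                               bool op_is_vec_duplicate)
{
  return (toy_vector_mode_p (mode)
          && op_is_vec_duplicate
          && code != TOY_VEC_DUPLICATE);
}
```

In this model the inner expression the old code would have built for
VEC_DUPLICATE has mode toy_inner_mode (TOY_V8SF), i.e. TOY_SF, which
toy_unary_result_mode_ok_p rejects, while the same rewrite for TOY_NEG is
accepted.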

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2018-03-20  Jakub Jelinek  <ja...@redhat.com>

        PR rtl-optimization/84989
        * simplify-rtx.c (simplify_unary_operation_1): Don't try to simplify
        VEC_DUPLICATE with scalar result mode.

        * gcc.target/i386/pr84989.c: New test.

--- gcc/simplify-rtx.c.jj       2018-01-20 10:52:47.000000000 +0100
+++ gcc/simplify-rtx.c  2018-03-20 17:13:11.906809795 +0100
@@ -1692,7 +1692,9 @@ simplify_unary_operation_1 (enum rtx_cod
       break;
     }
 
-  if (VECTOR_MODE_P (mode) && vec_duplicate_p (op, &elt))
+  if (VECTOR_MODE_P (mode)
+      && vec_duplicate_p (op, &elt)
+      && code != VEC_DUPLICATE)
     {
       /* Try applying the operator to ELT and see if that simplifies.
         We can duplicate the result if so.
--- gcc/testsuite/gcc.target/i386/pr84989.c.jj  2018-03-20 17:20:46.840921141 +0100
+++ gcc/testsuite/gcc.target/i386/pr84989.c     2018-03-20 17:19:57.257903317 +0100
@@ -0,0 +1,12 @@
+/* PR rtl-optimization/84989 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f" } */
+
+#include <x86intrin.h>
+
+__m512
+foo (float a, float *b)
+{
+  return _mm512_sub_ps (_mm512_broadcast_f32x4 (_mm_load_ps (b)),
+                       _mm512_broadcast_f32x4 (_mm_set1_ps (a)));
+}

        Jakub
