Hi!

The r222470 commit changed the =x constraint into =v in *truncdfsf_fast_mixed.
The problem is that for some tunings we have a splitter
/* For converting DF(xmm2) to SF(xmm1), use the following code instead of
   cvtsd2ss:
      unpcklpd xmm2,xmm2   ; packed conversion might crash on signaling NaNs
      cvtpd2ps xmm2,xmm1
If the input operand is memory, the splitter attempts to emit the
sse2_loadlpd instruction.  But that define_insn doesn't have any v
constraints, and so we fail to recognize it.  The 2-operand vmovsd m -> v
form is also implemented by *vec_concatv2df.
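
For illustration only, the memory-input case of that sequence can be sketched
in C with SSE2 intrinsics.  This is not the code the splitter emits; the
function name df_to_sf_packed is made up for the example, and which
instructions the compiler actually picks for each intrinsic is up to the
compiler.  The point is that the low-half load zeroes the upper element, so
the packed conversion has well-defined input in both halves.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86/x86_64 only */

/* Hypothetical helper, for illustration: convert the double at *p to
   float using a packed conversion instead of a scalar cvtsd2ss.  */
static float
df_to_sf_packed (const double *p)
{
  /* Load *p into the low element and zero the high element
     (the movsd/vmovsd m -> xmm form).  */
  __m128d v = _mm_load_sd (p);
  /* Packed double -> float conversion (cvtpd2ps); the float in
     element 0 is the converted low double.  */
  __m128 s = _mm_cvtpd_ps (v);
  /* Extract element 0.  */
  return _mm_cvtss_f32 (s);
}
```

With the default rounding mode the result should match a plain (float) cast
of the same double.
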
So I see 3 options for this:
1) as the patch does, emit *vec_concatv2df manually
2) rename *vec_concatv2df to vec_concatv2df and use gen_vec_concatv2df
   in the splitter; possibly use it instead of sse2_loadlpd there, because
   that insn has uglier/more complex pattern
3) tweak sse2_loadlpd - add various v alternatives to it, guard them with
   avx512vl isa, etc.

I bet option 3) is desirable and many other instructions likely need the
same treatment, but that doesn't sound like stage4 material to me; I find it
quite risky.  Do you agree?  If yes, the following patch can work as a
temporary fix (bootstrapped/regtested on x86_64-linux and i686-linux), or I
can do 2), but in that case I'd like to know your preference about the
suboption (whether to replace gen_sse2_loadlpd with gen_vec_concatv2df or
whether to use it only for the EXT_REX_SSE_REG_P regs).

2016-03-04  Jakub Jelinek  <ja...@redhat.com>

        PR target/70086
        * config/i386/i386.md (truncdfsf2 splitter): Handle
        EXT_REX_SSE_REG_P destination with memory input.

        * gcc.target/i386/pr70086-1.c: New test.
        * gcc.target/i386/pr70086-2.c: New test.

--- gcc/config/i386/i386.md.jj  2016-03-02 14:09:50.000000000 +0100
+++ gcc/config/i386/i386.md     2016-03-04 22:56:32.206840674 +0100
@@ -4392,6 +4392,11 @@ (define_split
        operands[4] = simplify_gen_subreg (V2DFmode, operands[1], DFmode, 0);
       emit_insn (gen_vec_dupv2df (operands[4], operands[1]));
     }
+  else if (EXT_REX_SSE_REG_P (operands[4]))
+    /* Emit *vec_concatv2df.  */
+    emit_insn (gen_rtx_SET (operands[4],
+                           gen_rtx_VEC_CONCAT (V2DFmode, operands[1],
+                                               CONST0_RTX (DFmode))));
   else
     emit_insn (gen_sse2_loadlpd (operands[4],
                                 CONST0_RTX (V2DFmode), operands[1]));
--- gcc/testsuite/gcc.target/i386/pr70086-1.c.jj        2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-1.c   2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,11 @@
+/* PR target/70086 */
+/* { dg-do compile } */
+/* { dg-options "-mtune=barcelona -mavx512vl -ffloat-store" } */
+
+float
+foo (float a, float b, double c, float d, double e, float f)
+{
+  e -= d;
+  d *= e;
+  return e + d;
+}
--- gcc/testsuite/gcc.target/i386/pr70086-2.c.jj        2016-03-04 23:01:07.447081169 +0100
+++ gcc/testsuite/gcc.target/i386/pr70086-2.c   2016-03-04 23:00:27.000000000 +0100
@@ -0,0 +1,12 @@
+/* PR target/70086 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=barcelona -mavx512vl" } */
+
+float
+foo (double *p)
+{
+  register float xmm16 __asm ("xmm16");
+  xmm16 = *p;
+  asm volatile ("" : "+v" (xmm16));
+  return xmm16;
+}

        Jakub
