Hello!

The assert in create_pre_exit in mode-switching.c expects a return copy
pair with nothing in between. However, the compiler enters the mode
switching pass with the following sequence:

(insn 19 18 16 2 (set (reg:V2SF 21 xmm0)
        (mem/c:V2SF (plus:DI (reg/f:DI 7 sp)
                (const_int -72 [0xffffffffffffffb8])) [0  S8 A64]))
"pr88070.c":8 1157 {*movv2sf_internal}
     (nil))
(insn 16 19 20 2 (set (reg:V2SF 0 ax [orig:91 <retval> ] [91])
        (reg:V2SF 0 ax [89])) "pr88070.c":8 1157 {*movv2sf_internal}
     (nil))
(insn 20 16 21 2 (unspec_volatile [
            (const_int 0 [0])
        ] UNSPECV_BLOCKAGE) "pr88070.c":8 710 {blockage}
     (nil))
(insn 21 20 23 2 (use (reg:V2SF 21 xmm0)) "pr88070.c":8 -1
     (nil))

Please note how (insn 16) interferes with the (insn 19)-(insn 21) return
copy pair.

The culprit is the blockage instruction (insn 20), which causes the
sched1 pass (the pre-reload scheduler) to skip marking (insn 19) as
unmovable (as an insn the return use insn depends on), so the scheduler
is free to schedule (insn 16) between the (insn 19)-(insn 21) return
copy pair.
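
Judging by the insn numbering, the pre-sched1 order presumably had
(insn 16) ahead of the %xmm0 load, roughly like this (a sketch only;
insn chain pointers, source locations and the memory address are
omitted):

(insn 16 (set (reg:V2SF 0 ax [orig:91 <retval> ]) (reg:V2SF 0 ax [89])))
(insn 19 (set (reg:V2SF 21 xmm0) (mem/c:V2SF ...)))
(insn 20 (unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE))
(insn 21 (use (reg:V2SF 21 xmm0)))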

The extra instruction is generated as a kludge in expand_function_end
to prevent instructions that may trap from being scheduled into the
function epilogue. However, the same blockage is generated under
exactly the same conditions earlier in expand_function_end. The first
blockage already prevents unwanted scheduling into the epilogue, so
there is actually no need for the second one.
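
For reference, the blockage in question is the guarded emit shown in
the hunk below; per the above, the same emit appears twice in
expand_function_end, and only the second copy is removed:

  /* Fence the epilogue so that insns that may trap do not get
     scheduled into it.  The first, earlier copy of this emit already
     does that, which is why the second one can go.  */
  if (cfun->can_throw_non_call_exceptions
      && targetm_common.except_unwind_info (&global_options) != UI_SJLJ)
    emit_insn (gen_blockage ());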

The attached patch removes the kludge.

BTW: the extra blockage would crash compilation for all mode-switching
targets, also with pre-reload mode switching; the post-reload vzeroupper
insertion merely trips the x86 target on a generic problem in the
middle-end.

2018-11-19  Uros Bizjak  <ubiz...@gmail.com>

    PR middle-end/88070
    * function.c (expand_function_end): Remove kludge that
    generates second blockage insn.

testsuite/ChangeLog:

2018-11-19  Uros Bizjak  <ubiz...@gmail.com>

    PR middle-end/88070
    * gcc.target/i386/pr88070.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32} for all default languages, obj-c++ and go.

OK for mainline and release branches?

Uros.
Index: function.c
===================================================================
--- function.c  (revision 266278)
+++ function.c  (working copy)
@@ -5447,13 +5447,6 @@ expand_function_end (void)
   if (naked_return_label)
     emit_label (naked_return_label);
 
-  /* @@@ This is a kludge.  We want to ensure that instructions that
-     may trap are not moved into the epilogue by scheduling, because
-     we don't always emit unwind information for the epilogue.  */
-  if (cfun->can_throw_non_call_exceptions
-      && targetm_common.except_unwind_info (&global_options) != UI_SJLJ)
-    emit_insn (gen_blockage ());
-
   /* If stack protection is enabled for this function, check the guard.  */
   if (crtl->stack_protect_guard && targetm.stack_protect_runtime_enabled_p ())
     stack_protect_epilogue ();
Index: testsuite/gcc.target/i386/pr88070.c
===================================================================
--- testsuite/gcc.target/i386/pr88070.c (nonexistent)
+++ testsuite/gcc.target/i386/pr88070.c (working copy)
@@ -0,0 +1,12 @@
+/* PR target/88070 */
+/* { dg-do compile } */
+/* { dg-options "-O -fexpensive-optimizations -fnon-call-exceptions -fschedule-insns -fno-dce -fno-dse -mavx" } */
+
+typedef float vfloat2 __attribute__ ((__vector_size__ (2 * sizeof (float))));
+
+vfloat2
+test1float2 (float c)
+{
+  vfloat2 v = { c, c };
+  return v;
+}
