https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93946

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, now looking myself.  RTL expansion creates

(insn 8 7 9 2 (set (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4
A32])
        (reg:SI 49)) "t.c":12:13 -1
     (nil))
(insn 9 8 10 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct aa
*)ptr_1(D)].a.u.i+0 S4 A32])
        (const_int 0 [0])) "t.c":13:12 -1
     (nil))
(insn 10 9 11 2 (set (mem/j:SI (plus:SI (reg/v/f:SI 48 [ ptr ])
                (const_int 4 [0x4])) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+4 S4
A32])
        (const_int 0 [0])) "t.c":13:12 -1
     (nil))
(insn 11 10 12 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb
*)ptr_1(D)].b.u.f+0 S4 A32])
        (const_int 0 [0])) "t.c":14:12 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 51)
        (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32]))
"t.c":15:17 -1
     (nil))

where insn 11 is the important one.  Somehow on nios2 the CSE1 removes that
store.

deferring deletion of insn with uid = 11.

and we end up with

(insn 8 7 9 2 (set (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4
A32])
        (reg:SI 49)) "t.c":12:13 5 {movsi_internal}
     (expr_list:REG_DEAD (reg:SI 49)
        (nil)))
(insn 9 8 10 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct aa
*)ptr_1(D)].a.u.i+0 S4 A32])
        (const_int 0 [0])) "t.c":13:12 5 {movsi_internal}
     (nil))
(insn 10 9 12 2 (set (mem/j:SI (plus:SI (reg/v/f:SI 48 [ ptr ])
                (const_int 4 [0x4])) [1 MEM[(struct aa *)ptr_1(D)].a.u.i+4 S4
A32])
        (const_int 0 [0])) "t.c":13:12 5 {movsi_internal}
     (nil))
(insn 12 10 13 2 (set (reg:SI 51 [ bv_3(D)->b.u.f ])
        (mem/j:SI (reg/v/f:SI 47 [ bv ]) [1 bv_3(D)->b.u.f+0 S4 A32]))
"t.c":15:17 5 {movsi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 47 [ bv ])
        (nil)))

where there indeed is no scheduling barrier anymore.

I didn't know CSE removes stores, nor do I know why this only triggers on
nios2; it looks like some DF thing?  Backtrace of the "DSE":

#0  delete_insn (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cfgrtl.c:135
#1  0x0000000000b0bfa5 in delete_insn_and_edges (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cfgrtl.c:237
#2  0x0000000001a9d8eb in cse_insn (insn=0x7ffff6bc3400)
    at /space/rguenther/src/gcc/gcc/cse.c:5571
#3  0x0000000001aa0b76 in cse_extended_basic_block (ebb_data=0x7fffffffdc90)
    at /space/rguenther/src/gcc/gcc/cse.c:6614
#4  0x0000000001aa10a5 in cse_main (f=0x7ffff6cce310, nregs=52)
    at /space/rguenther/src/gcc/gcc/cse.c:6793

that's

      /* Similarly for no-op moves.  */
      else if (noop_insn)
        {
          if (cfun->can_throw_non_call_exceptions && can_throw_internal (insn))
            cse_cfg_altered = true;
          cse_cfg_altered |= delete_insn_and_edges (insn);
          /* No more processing for this set.  */
          sets[i].rtl = 0;

so apparently it does redundant store removal as well...

          /* Similarly, lots of targets don't allow no-op
             (set (mem x) (mem x)) moves.  Even (set (reg x) (reg x))
             might be impossible for certain registers (like CC registers).  */
          else if (n_sets == 1
                   && !CALL_P (insn)
                   && (MEM_P (trial) || REG_P (trial))
                   && rtx_equal_p (trial, dest)
                   && !side_effects_p (dest)
                   && (cfun->can_delete_dead_exceptions
                       || insn_nothrow_p (insn)))
            {
              SET_SRC (sets[i].rtl) = trial;
              noop_insn = true;
              break;
            }

where

(gdb) p debug_rtx (insn)
(insn 11 10 12 2 (set (mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb
*)ptr_1(D)].b.u.f+0 S4 A32])
        (const_int 0 [0])) "t.c":14:12 5 {movsi_internal}
     (expr_list:REG_DEAD (reg/v/f:SI 48 [ ptr ])
        (nil)))
(gdb) p debug_rtx (trial)
(mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4
A32])
$4 = void
(gdb) p debug_rtx (dest)
(mem/j:SI (reg/v/f:SI 48 [ ptr ]) [1 MEM[(struct bb *)ptr_1(D)].b.u.f+0 S4
A32])
$6 = void
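
To make the hazard concrete, here is a toy model in C.  It is not GCC code
and does not mirror how cse.c represents equivalences; every name in it
(toy_mem, toy_known, toy_store_looks_noop, ...) is hypothetical.  The
assumption it illustrates is that a redundancy check which compares only
address, size and stored value will call insn 11 a no-op, although its
access path differs from the one used by insns 9 and 10.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a MEM rtx: base register, offset, size,
   and the access-path annotation shown in brackets in the dump.  */
struct toy_mem
{
  int base_reg;
  long offset;
  int size;
  const char *expr;
};

/* Hypothetical record: location MEM is known to hold VALUE.  */
struct toy_known
{
  struct toy_mem mem;
  long value;
};

/* Same base register, offset and size.  The annotation is ignored on
   purpose; that is the assumption this sketch makes.  */
static bool
toy_same_location (const struct toy_mem *x, const struct toy_mem *y)
{
  return (x->base_reg == y->base_reg
          && x->offset == y->offset
          && x->size == y->size);
}

/* True if storing VALUE to DEST repeats a value already known there.  */
static bool
toy_store_looks_noop (const struct toy_known *known, size_t n,
                      const struct toy_mem *dest, long value)
{
  for (size_t i = 0; i < n; i++)
    if (toy_same_location (&known[i].mem, dest) && known[i].value == value)
      return true;
  return false;
}

/* True if an earlier store to the same location used another access
   path, so deleting the later store would lose DEST's path.  */
static bool
toy_store_carries_new_path (const struct toy_known *known, size_t n,
                            const struct toy_mem *dest)
{
  for (size_t i = 0; i < n; i++)
    if (toy_same_location (&known[i].mem, dest)
        && strcmp (known[i].mem.expr, dest->expr) != 0)
      return true;
  return false;
}

/* Insns 9 to 11 from the dump, transcribed by hand.  Returns 1 when the
   store is both value-redundant and the only one on its access path.  */
int
toy_demo (void)
{
  const struct toy_known known[] = {
    { { 48, 0, 4, "MEM[(struct aa *)ptr].a.u.i+0" }, 0 },  /* insn 9 */
    { { 48, 4, 4, "MEM[(struct aa *)ptr].a.u.i+4" }, 0 },  /* insn 10 */
  };
  const struct toy_mem insn11_dest
    = { 48, 0, 4, "MEM[(struct bb *)ptr].b.u.f+0" };
  size_t n = sizeof known / sizeof known[0];

  return (toy_store_looks_noop (known, n, &insn11_dest, 0)
          && toy_store_carries_new_path (known, n, &insn11_dest));
}
```

In this model both predicates hold for insn 11, which is the combination
that makes deleting the store unsafe for later passes that reason from the
access path.
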

So the trigger might be a target where sizeof(long long) == 2 * sizeof(long)
_and_ where we split stores to the larger type (I tried to pick a set of
types where sizeof is the same but alias-sets are different - otherwise I'd
have to cater for big vs. little-endian).
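
For readers without t.c at hand, the following is a sketch of a source file
that would be consistent with the MEM expressions in the dumps above.  It is
a reconstruction, not the file attached to the bug: the type layout (a union
of long long and long wrapped in two levels of struct) is inferred from
`.a.u.i` being split into two SImode stores and `.b.u.f` being a single one,
and the driver call_foo_with_overlap is a hypothetical helper added here.

```c
/* Assumed layout: on a 32-bit target such as nios2, 'i' is 8 bytes
   (stored as two SImode halves) and 'f' is 4 bytes.  */
union U { long long i; long f; };
struct a { union U u; };
struct aa { struct a a; };
struct b { union U u; };
struct bb { struct b b; };

long
foo (struct bb *bv, void *ptr)
{
  struct aa *a = ptr;
  struct bb *b = ptr;
  bv->b.u.f = 1;   /* insn 8 */
  a->a.u.i = 0;    /* insns 9 and 10 */
  b->b.u.f = 0;    /* insn 11, the store CSE1 deletes */
  return bv->b.u.f;  /* insn 12 */
}

/* Hypothetical driver: pass the same object through both pointers, so
   the last store decides the value that foo reads back.  */
long
call_foo_with_overlap (void)
{
  union { struct aa aa; struct bb bb; } v;
  return foo (&v.bb, &v);
}
```

With both pointers referring to the same object the expected result is 0.
Once insn 11 is gone, nothing forces the load in insn 12 to stay below the
stores in insns 9 and 10.
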
