The fix for PR117868.

In brief:
this is an LRA bug caused by reusing spill slots after frame pointer
spilling.
A slot was created for a QImode register (1 byte) and, after the frame pointer
was spilled, it was reused for a TImode register (16 bytes), which then
overlaps other slots.

Things go wrong in `lra_spill ()':
---------------------------- part of lra-spills.cc ----------------------------
  n = assign_spill_hard_regs (pseudo_regnos, n);
  slots_num = 0;
  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);  <--- first call ---
  for (i = 0; i < n; i++)
    if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
      assign_mem_slot (pseudo_regnos[i]);
  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
    {
      /* Assign stack slots to spilled pseudos assigned to fp.  */
      assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);  <--- second call ---
      for (i = 0; i < n2; i++)
        if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
          assign_mem_slot (pseudo_regnos[i]);
    }
------------------------------------------------------------------------------

In the first call of `assign_stack_slot_num_and_sort_pseudos (...)' LRA
allocates slot #17 for r93 (QImode, 1 byte).
In the second call of `assign_stack_slot_num_and_sort_pseudos (...)' LRA
reuses slot #17 for r114 (TImode, 16 bytes).
This is wrong: a 1-byte slot cannot be reused for a 16-byte register.


Details:

After the IRA pass we have:
------------------ part of simd-t.c.319r.ira ----------------------
(insn 269 268 270 8 (set (subreg:QI (reg:TI 114 [ _116 ]) 6)
        (xor:QI (reg:QI 66 [ _36 ])
            (reg:QI 67 [ _37 ]))) 631 {xorqi3}
     (expr_list:REG_DEAD (reg:QI 67 [ _37 ])
        (expr_list:REG_DEAD (reg:QI 66 [ _36 ])
            (nil))))
(insn 270 269 271 8 (set (subreg:QI (reg:TI 114 [ _116 ]) 7)
        (xor:QI (reg:QI 69 [ _39 ])
            (reg:QI 70 [ _40 ]))) 631 {xorqi3}
     (expr_list:REG_DEAD (reg:QI 70 [ _40 ])
        (expr_list:REG_DEAD (reg:QI 69 [ _39 ])
            (nil))))
-------------------------------------------------------------------

During LRA spilling:
------------------ part of simd-t.c.320r.reload -------------------
      Creating newreg=348 from oldreg=66, assigning class GENERAL_REGS to r348
  269: r348:QI=r348:QI^r67:QI
      REG_DEAD r67:QI
      REG_DEAD r66:QI
    Inserting insn reload before:
  543: r348:QI=r66:QI
    Inserting insn reload after:
  544: r114:TI#6=r348:QI

[...]

      Choosing alt 0 in insn 270:  (0) =r  (1) %0  (2) r {xorqi3}
      Creating newreg=349 from oldreg=69, assigning class GENERAL_REGS to r349
      Creating newreg=350 from oldreg=70, assigning class GENERAL_REGS to r350
  270: r349:QI=r349:QI^r350:QI
      REG_DEAD r70:QI
      REG_DEAD r69:QI
    Inserting insn reload before:
  545: r349:QI=r69:QI
  547: r350:QI=r70:QI
    Inserting insn reload after:
  546: r114:TI#7=r349:QI
-------------------------------------------------------------------


After the LRA pass:
------------------ part of simd-t.c.320r.reload -------------------
(insn 543 542 269 11 (set (reg:QI 14 r14 [orig:66 _36 ] [66])
        (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 7 [0x7])) [4 %sfp+7 S1 A8])) 113 {movqi_insn_split}
     (nil))
(insn 269 543 544 11 (set (reg:QI 14 r14 [orig:66 _36 ] [66])
        (xor:QI (reg:QI 14 r14 [orig:66 _36 ] [66])
            (reg:QI 2 r2 [orig:67 _37 ] [67]))) 631 {xorqi3}
     (nil))
(insn 544 269 545 11 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 24 [0x18])) [4 %sfp+24 S1 A8])
        (reg:QI 14 r14 [orig:66 _36 ] [66])) 113 {movqi_insn_split}
     (nil))
(insn 545 544 547 11 (set (reg:QI 14 r14 [orig:69 _39 ] [69])
        (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 24 [0x18])) [4 %sfp+24 S1 A8])) 113 {movqi_insn_split}
     (nil))
(insn 547 545 270 11 (set (reg:QI 13 r13 [orig:70 _40 ] [70])
        (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 8 [0x8])) [4 %sfp+8 S1 A8])) 113 {movqi_insn_split}
     (nil))
(insn 270 547 546 11 (set (reg:QI 14 r14 [orig:69 _39 ] [69])
        (xor:QI (reg:QI 14 r14 [orig:69 _39 ] [69])
            (reg:QI 13 r13 [orig:70 _40 ] [70]))) 631 {xorqi3}
     (nil))
(insn 546 270 548 11 (set (mem/c:QI (plus:HI (reg/f:HI 28 r28)
                (const_int 25 [0x19])) [4 %sfp+25 S1 A8])
        (reg:QI 14 r14 [orig:69 _39 ] [69])) 113 {movqi_insn_split}
     (nil))
-------------------------------------------------------------------

Simplified insns before LRA:
----------------------------------
insn 269 r114.6  =  r66 ^ r67
insn 270 r114.7  =  r69 ^ r70
----------------------------------

after LRA:
----------------------------------
insn 543 r14 {r66} = [sfp+7]    # reload insn
insn 269 r14 {r66} ^= r2 {r67}
insn 544 [sfp+24] = r14 {r66}   # reload insn -- bug is here !

insn 545 r14 {r69} = [sfp+24]   # reload insn
insn 547 r13 {r70} = [sfp+8]    # reload insn
insn 270 r14 {r69} ^= r13 {r70}
insn 546 [sfp+25] = r14 {r69}   # reload insn
----------------------------------

The bug appears in insn 544.
It stores the result computed from pseudo r66 to `[sfp+24]', which is the
same address that insn 545 reads as the spill slot of pseudo r69.

The problem is here:
------------------ part of simd-t.c.320r.reload -------------------
  Slot 17 regnos (width = 0):    93      114
-------------------------------------------------------------------
Where:
r93 is a QImode register
r114 is a TImode register (it occupies 16 registers on the AVR target!)

The problem is not between r93 and r114 themselves (their live ranges do not
intersect).
The problem is with the other spill slots: r114 is a large register, so once
it is placed in the 1-byte slot it overlaps neighbouring slots.
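
The overlap can be checked with simple byte-range arithmetic.  The sketch
below is only an illustration, not LRA code: `struct frame_range' and
`ranges_overlap_p' are hypothetical names, and the offsets are derived from
the dump above under the assumption that byte 6 of r114 (stored at %sfp+24 by
insn 544) lies 6 bytes above the start of its slot, so the slot starts at
%sfp+18.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a byte range in the frame:
   [offset, offset + size).  */
struct frame_range
{
  int offset;   /* Offset from %sfp, in bytes.  */
  int size;     /* Number of bytes occupied.  */
};

/* Return true if the two byte ranges share at least one byte.  */
static bool
ranges_overlap_p (struct frame_range a, struct frame_range b)
{
  assert (a.size > 0 && b.size > 0);
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

/* Slot 17 as allocated for r93 (QImode): 1 byte.  The base offset 18 is
   an assumption derived from insn 544, see the text above.  */
static const struct frame_range slot17_as_qi = { 18, 1 };

/* The same slot reused for r114 (TImode): 16 bytes from the same base.  */
static const struct frame_range slot17_as_ti = { 18, 16 };

/* The spill slot of r69 (QImode), reloaded from %sfp+24 by insn 545.  */
static const struct frame_range r69_slot = { 24, 1 };
```

With these values `ranges_overlap_p (slot17_as_qi, r69_slot)' is false, while
`ranges_overlap_p (slot17_as_ti, r69_slot)' is true, which matches the
clobber seen between insns 544 and 545.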

Things go wrong in `lra_spill ()':
---------------------------- part of lra-spills.cc ----------------------------
  n = assign_spill_hard_regs (pseudo_regnos, n);
  slots_num = 0;
  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);  <--- first call ---
  for (i = 0; i < n; i++)
    if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
      assign_mem_slot (pseudo_regnos[i]);
  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
    {
      /* Assign stack slots to spilled pseudos assigned to fp.  */
      assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);  <--- second call ---
      for (i = 0; i < n2; i++)
        if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
          assign_mem_slot (pseudo_regnos[i]);
    }
------------------------------------------------------------------------------

In the first call of `assign_stack_slot_num_and_sort_pseudos (...)' LRA
allocates slot #17 for r93 (QImode, 1 byte).
In the second call of `assign_stack_slot_num_and_sort_pseudos (...)' LRA
reuses slot #17 for r114 (TImode, 16 bytes).
This is wrong: a 1-byte slot cannot be reused for a 16-byte register.

With the patch, a slot is reused only if it has no memory allocated yet, or
if the new register is of equal or smaller size and needs equal or smaller
alignment than the slot.
The patch also contains a small fix for the debugging output of the slot
width: it prints the slot size as the width, not 0, which is the size of
(mem/c:BLK (...)).
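
The reuse rule can be modelled in isolation as follows.  This is a simplified
sketch, not the patched code: `struct slot_model' and `slot_reusable_p' are
hypothetical names, sizes are plain integers instead of the poly_int values
compared with `compare_sizes_for_sort' in the patch, and the live-range check
that precedes the new test in LRA is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an LRA spill slot record.  */
struct slot_model
{
  bool has_mem;     /* True once stack memory has been assigned.  */
  unsigned size;    /* Slot size in bytes.  */
  unsigned align;   /* Slot alignment in bits.  */
};

/* Return true if a register of SIZE bytes needing ALIGN bits of alignment
   may share SLOT.  Live ranges are assumed to be already known not to
   intersect.  */
static bool
slot_reusable_p (const struct slot_model *slot, unsigned size, unsigned align)
{
  assert (slot != NULL);

  /* A slot without allocated memory can still grow, so it can be shared.  */
  if (!slot->has_mem)
    return true;

  /* A slot with allocated memory is fixed: share it only with an equal or
     smaller register with equal or smaller alignment.  */
  return slot->align >= align && slot->size >= size;
}
```

In this model, a 1-byte slot that already has memory is rejected for a
16-byte register, while the same slot without memory, or a 16-byte slot asked
to hold a 1-byte register, is accepted.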

Bootstrapped and regtested on x86_64 without problems.

Ok for trunk?


Denis.


        PR rtl-optimization/117868
gcc/
        * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
        only without allocated memory or only with equal or smaller registers
        with equal or smaller alignment.
        (lra_spill): Print slot size as width.


diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index db78dcd28a3..93a0c92db9f 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, int n)
                && ! (lra_intersected_live_ranges_p
                      (slots[j].live_ranges,
                       lra_reg_info[regno].live_ranges)))
-             break;
+             {
+               /* A slot without allocated memory can be shared.  */
+               if (slots[j].mem == NULL_RTX)
+                 break;
+
+               /* A slot with allocated memory can be shared only with equal
+                  or smaller register with equal or smaller alignment.  */
+               if (slots[j].align >= spill_slot_alignment (mode)
+                   && compare_sizes_for_sort (slots[j].size,
+                                              GET_MODE_SIZE (mode)) != -1)
+                 break;
+             }
        }
       if (j >= slots_num)
        {
@@ -656,8 +667,7 @@ lra_spill (void)
       for (i = 0; i < slots_num; i++)
        {
          fprintf (lra_dump_file, "  Slot %d regnos (width = ", i);
-         print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)),
-                    lra_dump_file, SIGNED);
+         print_dec (slots[i].size, lra_dump_file, SIGNED);
          fprintf (lra_dump_file, "):");
          for (curr_regno = slots[i].regno;;
               curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
