From: "Cui, Lili" <lili....@intel.com> Hi,
This patch aims to remove issue code in improve_allocation that was causing expensive allocno overflows. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Thanks, Lili. The original intention of this code was to allow more allocnos to share the same register, but this led to expensive allocno overflows. Extracted a small case (a bit large, see Bugzilla PR117838 for details) from 548.exchange2_r to analyze this register allocation issue. Before improve_allocation function: a537 (cost 1896, reg42) a20 (cost 270, reg1) a13 (cost 144, spill) a551 (cost 70, reg40) a5 (cost 43, spill) a493 (cost 30, reg42) a499 (cost 12, reg40) ------------------------------ Dump info in improve_allocation function: Base: Spilling a493r125 for a5r113 Spilling a573r202 for a5r113 Spilling a499r248 for a13r106 Spilling a551r120 for a13r106 Spilling a20r237 for a551r120 With patch: Spilling a499r248 for a13r106 Spilling a551r120 for a13r106 Spilling a493r125 for a551r120 ------------------------------ After assign_hard_reg (at the end of improve_allocation): Base: a537 (cost 1896, reg1) a20 (cost 270, spill) -----> This is unreasonable a13 (cost 144, reg40) a551 (cost 70, reg1) a5 (cost 43, reg42) a493 (cost 30, spill) a499 (cost 12, reg1) With patch: a537 (cost 1896, reg42) a20 (cost 270, reg1) a13 (cost 144, reg40) a551 (cost 70, reg42) a5 (cost 43, spill) a493 (cost 30, spill) a499 (cost 12, reg42) ----------------------------- Collected spec2017 performance on Znver3/Graviton4/EMR/SRF for O2 and Ofast. No performance regression was observed. FOR multi-copy O2 SRF: 548.exchange2_r increased by 7.5%, 500.perlbench_r increased by 2.0%. EMR: 548.exchange2_r increased by 4.5%, 500.perlbench_r increased by 1.7%. Graviton4: 548.exchange2_r Increased by 2.2%, 511.povray_r increased by 2.8%. Znver3 : 500.perlbench_r increased by 2.0%. gcc/ChangeLog: PR rtl-optimization/117838 * ira-color.cc (improve_allocation): Remove the issue code. --- gcc/ira-color.cc | 41 +++++++++++------------------------------ 1 file changed, 11 insertions(+), 30 deletions(-) diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index 4b9296029cc..fa2ea61cadf 100644 --- a/gcc/ira-color.cc +++ b/gcc/ira-color.cc @@ -3304,8 +3304,6 @@ improve_allocation (void) assigning hard register to allocno A even without spilling conflicting allocnos. */ continue; - auto_bitmap allocnos_to_spill; - HARD_REG_SET soft_conflict_regs = {}; mode = ALLOCNO_MODE (a); nwords = ALLOCNO_NUM_OBJECTS (a); /* Process each allocno conflicting with A and update the cost @@ -3331,40 +3329,24 @@ improve_allocation (void) ALLOCNO_COLOR_DATA (conflict_a)->temp = check; if ((conflict_hregno = ALLOCNO_HARD_REGNO (conflict_a)) < 0) continue; - auto spill_a = ira_soft_conflict (a, conflict_a); - if (spill_a) - { - if (!bitmap_set_bit (allocnos_to_spill, - ALLOCNO_NUM (spill_a))) - continue; - ira_loop_border_costs border_costs (spill_a); - spill_cost = border_costs.spill_inside_loop_cost (); - } + spill_cost = ALLOCNO_UPDATED_MEMORY_COST (conflict_a); + k = (ira_class_hard_reg_index + [ALLOCNO_CLASS (conflict_a)][conflict_hregno]); + ira_assert (k >= 0); + if ((allocno_costs = ALLOCNO_HARD_REG_COSTS (conflict_a)) + != NULL) + spill_cost -= allocno_costs[k]; else - { - spill_cost = ALLOCNO_UPDATED_MEMORY_COST (conflict_a); - k = (ira_class_hard_reg_index - [ALLOCNO_CLASS (conflict_a)][conflict_hregno]); - ira_assert (k >= 0); - if ((allocno_costs = ALLOCNO_HARD_REG_COSTS (conflict_a)) - != NULL) - spill_cost -= allocno_costs[k]; - else - spill_cost -= ALLOCNO_UPDATED_CLASS_COST (conflict_a); - spill_cost - += allocno_copy_cost_saving (conflict_a, conflict_hregno); - } + spill_cost -= ALLOCNO_UPDATED_CLASS_COST (conflict_a); + spill_cost + += allocno_copy_cost_saving (conflict_a, conflict_hregno); conflict_nregs = hard_regno_nregs (conflict_hregno, ALLOCNO_MODE (conflict_a)); auto note_conflict = [&](int r) { if (check_hard_reg_p (a, r, conflicting_regs, profitable_hard_regs)) - { - if (spill_a) - SET_HARD_REG_BIT (soft_conflict_regs, r); - costs[r] += spill_cost; - } + costs[r] += spill_cost; }; for (r = conflict_hregno; r >= 0 && (int) end_hard_regno (mode, r) > conflict_hregno; @@ -3396,7 +3378,6 @@ improve_allocation (void) by spilling some conflicting allocnos does not improve the allocation cost. */ continue; - spill_soft_conflicts (a, allocnos_to_spill, soft_conflict_regs, best); nregs = hard_regno_nregs (best, mode); /* Now spill conflicting allocnos which contain a hard register of A when we assign the best chosen hard register to it. */ -- 2.34.1