On 2022-01-06 09:48, Richard Sandiford wrote:
This patch looks for allocno conflicts of the following form:

- One allocno (X) is a cap allocno for some non-cap allocno X2.
- X2 belongs to some loop L2.
- The other allocno (Y) is a non-cap allocno.
- Y is an ancestor of some allocno Y2 in L2.
- Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
- Y can use a different allocation from Y2.

In this case, Y's register is live across L2 but is not used within it,
whereas X's register is used only within L2.  The conflict is therefore
only "soft", in that it can easily be avoided by spilling Y2 inside L2
without affecting any insn references.
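
The conditions listed above can be sketched as a small predicate. This is only a toy model, not the code from the patch: the struct layouts, the field names (cap_member, may_differ_from_ancestors) and the function find_soft_conflict_spill are hypothetical stand-ins for IRA's real allocno and loop-tree structures, and "Y can use a different allocation from Y2" is reduced to a single flag on Y2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct allocno;

/* Hypothetical, simplified loop-tree node.  */
struct loop_node {
  struct loop_node *parent;    /* enclosing loop, NULL at the root */
  struct allocno **allocnos;   /* allocnos that belong to this loop */
  size_t n_allocnos;
};

/* Hypothetical, simplified allocno.  */
struct allocno {
  int nrefs;                   /* references inside its loop (cf. ALLOCNO_NREFS) */
  struct loop_node *loop;      /* loop the allocno belongs to */
  struct allocno *cap_member;  /* allocno this one caps, NULL for a non-cap */
  struct allocno *parent;      /* allocno for the same pseudo in the parent loop */
  bool may_differ_from_ancestors; /* assumption: one flag stands in for
                                     "can use a different allocation" */
};

/* Return true if ANC is reachable from A by following parent links.  */
static bool is_ancestor(const struct allocno *anc, const struct allocno *a)
{
  for (const struct allocno *p = a->parent; p != NULL; p = p->parent)
    if (p == anc)
      return true;
  return false;
}

/* If the conflict between X and Y is "soft" in the sense described above,
   return the allocno Y2 that could be spilled inside L2; otherwise NULL.  */
static const struct allocno *
find_soft_conflict_spill(const struct allocno *x, const struct allocno *y)
{
  /* X must be a cap and Y must be a non-cap.  */
  if (x->cap_member == NULL || y->cap_member != NULL)
    return NULL;

  /* Strip (possibly nested) caps to find the non-cap X2 and its loop L2.  */
  const struct allocno *x2 = x;
  while (x2->cap_member != NULL)
    x2 = x2->cap_member;
  const struct loop_node *l2 = x2->loop;

  /* Look for a descendant Y2 of Y in L2 with no references in L2.  */
  for (size_t i = 0; i < l2->n_allocnos; i++) {
    const struct allocno *y2 = l2->allocnos[i];
    if (is_ancestor(y, y2) && y2->nrefs == 0 && y2->may_differ_from_ancestors)
      return y2;
  }
  return NULL;
}

/* Build the situation from the description and check the predicate.
   Returns 1 if both checks behave as described.  */
static int check_example(void)
{
  struct loop_node l1 = {0}, l2 = {0};
  struct allocno x = {0}, x2 = {0}, y = {0}, y2 = {0};
  struct allocno *l2_allocnos[2];

  l2.parent = &l1;
  l2_allocnos[0] = &x2;
  l2_allocnos[1] = &y2;
  l2.allocnos = l2_allocnos;
  l2.n_allocnos = 2;

  x2.loop = &l2; x2.nrefs = 5;          /* used only within L2 */
  x.loop = &l1;  x.cap_member = &x2;    /* X caps X2 in the outer loop */
  y.loop = &l1;  y.nrefs = 3;           /* live across L2 */
  y2.loop = &l2; y2.parent = &y;        /* Y2 is Y's allocno in L2 */
  y2.nrefs = 0;  y2.may_differ_from_ancestors = true;

  bool soft = find_soft_conflict_spill(&x, &y) == &y2;

  /* A single reference to Y2 in L2 makes the conflict "hard" again.  */
  y2.nrefs = 1;
  bool hard = find_soft_conflict_spill(&x, &y) == NULL;

  return soft && hard;
}
```

In this model, flipping y2.nrefs from 0 to 1 is enough to turn the conflict from soft to hard, which is the "cliff edge" mentioned in the next paragraph.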

In principle we could do this for ALLOCNO_NREFS (Y2) != 0 too, with the
callers then taking Y2's ALLOCNO_MEMORY_COST into account.  There would
then be no "cliff edge" between a Y2 that has no references and a Y2 that
has (say) a single cold reference.

However, doing that isn't necessary for the PR and seems to give
variable results in practice.  (fotonik3d_r improves slightly but
namd_r regresses slightly.)  It therefore seemed better to start
with the higher-value zero-reference case and see how things go.

On top of the previous patches in the series, this fixes the exchange2
regression seen in GCC 11.

gcc/
        PR rtl-optimization/98782
        * ira-int.h (ira_soft_conflict): Declare.
        * ira-color.c (max_soft_conflict_loop_depth): New constant.
        (ira_soft_conflict): New function.
        (spill_soft_conflicts): Likewise.
        (assign_hard_reg): Use them to handle the case described by
        the comment above ira_soft_conflict.
        (improve_allocation): Likewise.
        * ira.c (check_allocation): Allow allocnos with "soft" conflicts
        to share the same register.

gcc/testsuite/
        * gcc.target/aarch64/reg-alloc-4.c: New test.

OK.  If something goes wrong with the patches (e.g. a lot of GCC testsuite failures or performance degradation), we can revert just the last 3 of them, since those are the ones that actually change the heuristics.  But I hope that will not be necessary.

Thank you again for working on the PR.  Fixing it required a big effort in thinking, testing and benchmarking.

