On 5/6/26 5:21 AM, Stefan Schulze Frielinghaus wrote:
From: Stefan Schulze Frielinghaus <[email protected]>

Currently an "entire" address is reloaded even in cases where section
anchors are involved.  This makes it harder to share section anchors
which is the whole point of them.  For example, in cases where
offsetable MEMs are ok do not reload .LANCHOR42+offset but only
.LANCHOR42 and replace the address with the resulting reload register
and the offset.  As a consequence subsequent passes only have to deal
with register equivalences in order to share section anchors.  For
example


I thought how to fix this in another place as LRA is already too complicated.  It could be fixed in some machined-dependent pass or in split1.  Adding the pass is overkill and fix in split1 would be ok if the target would work with memory via few insns (e.g. only via load/store insns).  So probably LRA is the best place to fix this problem.


double x;
double y;

double foo ()
{
   return x + y;
}

With this patch, after LRA we have

    20: %r1:DI=`*.LANCHOR0'
    17: %f0:DF=[%r1:DI]
    19: %r1:DI=`*.LANCHOR0'
    12: {%f0:DF=%f0:DF+[%r1:DI+0x8];clobber %cc:CC;}

and after postreload

    20: %r1:DI=`*.LANCHOR0'
    17: %f0:DF=[%r1:DI]
    12: {%f0:DF=%f0:DF+[%r1:DI+0x8];clobber %cc:CC;}

Of course, this was a lucky case since both reloads referred to the very
same register which allows for trivial removal of insn 19.  At least in
cases like demonstrated by the new test section-anchors-4.c we are
guaranteed to re-use the reload for a single insn.

Before testing this patch for multiple targets, I'm wondering whether
there is even a way to re-use reloads during LRA across insns (like an
equiv) such that we wouldn't depend on subsequent passes?

LRA reuses the reload pseudo generated for one insn (please see usage of curr_insn_input_reloads).  The problem in not reusing reload pseudo for the above example is that because reload for anchor occurs from different insns.  First LRA reloads [*.LANCHOR0] (insn 17 generated to satisfy reg constraint in insn 12), then the anchor (from insn 17 ), and then reload *.LANCHOR0 from 2nd op of insn 12.  But I'd not be worry that the reload pseudos get different hard regs.  Knowing how assigning hard regs works in LRA I see very small probability that the pseduos get different regs.

BTW I did not reproduce the testcase situation w/o -march options (the anchor in this case already in a pseudo before RA).  But -march=z13 reproduces it.  So probably you need to add this option to the test.

So the patch (with -march=z13 or other one reproducing the situation as additional option for the test) is ok for me. Thank you.

---
  gcc/lra-constraints.cc                        | 30 +++++++++++++++++++
  .../gcc.target/s390/section-anchors-4.c       | 25 ++++++++++++++++
  2 files changed, 55 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/s390/section-anchors-4.c

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index ccd68efc956..6779dfee020 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -4839,6 +4839,36 @@ curr_insn_transform (bool check_only_p)
            new_reg = emit_inc (rclass, *loc,
                                /* This value does not matter for MODIFY.  */
                                GET_MODE_SIZE (GET_MODE (op)));
+         /* Try to pull out section anchors.  For example, instead of
+            reloading an "entire" address like .LANCHOR42+offset only reload
+            .LANCHOR42 and use the new reload register as the base register.
+            This allows following optimizations to share section anchors and
+            remove redundant loads.  */
+         else if (GET_CODE (*loc) == CONST
+                  && GET_CODE (XEXP (*loc, 0)) == PLUS
+                  && GET_CODE (XEXP (XEXP (*loc, 0), 0)) == SYMBOL_REF
+                  && SYMBOL_REF_ANCHOR_P (XEXP (XEXP (*loc, 0), 0))
+                  && CONST_INT_P (XEXP (XEXP (*loc, 0), 1))
+                  /* Some offsets are valid in conjunction with a symbol and
+                     invalid in conjunction with a register.  Thus, pull out
+                     the anchor only in case the offset is a valid anchor
+                     offset.  */
+                  && INTVAL (XEXP (XEXP (*loc, 0), 1))
+                     >= targetm.min_anchor_offset
+                  && INTVAL (XEXP (XEXP (*loc, 0), 1))
+                     <= targetm.max_anchor_offset)
+           {
+              rtx anchor = XEXP (XEXP (*loc, 0), 0);
+              rtx offset = XEXP (XEXP (*loc, 0), 1);
+
+              if (get_reload_reg (OP_IN, Pmode, anchor, rclass,
+                                  NULL, false, false,
+                                  "offsetable address", &new_reg))
+                 lra_emit_move (new_reg, anchor);
+
+               new_reg = gen_rtx_PLUS (Pmode, new_reg, offset);
+               lra_assert (valid_address_p (Pmode, new_reg, MEM_ADDR_SPACE 
(op)));
+           }
          else if (get_reload_reg (OP_IN, Pmode, *loc, rclass,
                                   NULL, false, false,
                                   "offsetable address", &new_reg))
diff --git a/gcc/testsuite/gcc.target/s390/section-anchors-4.c 
b/gcc/testsuite/gcc.target/s390/section-anchors-4.c
new file mode 100644
index 00000000000..0b4cd081c61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/section-anchors-4.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-ira-slim -fdump-rtl-reload-slim" } */
+/* { dg-final { scan-assembler-times "\tlarl\t" 1 } } */
+/* { dg-final { scan-rtl-dump 
"%cc:CCZ=cmp\\\(\\\[`\\\*.LANCHOR0'\\\],\\\[const\\\(`\\\*.LANCHOR0'\\\+0x8\\\)\\\]\\\)" 
"ira" } } */
+/* { dg-final { scan-rtl-dump 
"%cc:CCZ=cmp\\\(\\\[(%r\[1-9\]\[0-9\]?):DI\\\],\\\[\\1:DI\\\+0x8\\\]\\\)" 
"reload" } } */
+
+/* Ensure that we get the same reload register for the second memory operand.
+   Prior LRA we have
+
+   55: %cc:CCZ=cmp([`*.LANCHOR0'],[const(`*.LANCHOR0'+0x8)])
+
+   and after LRA
+
+   59: %r1:DI=`*.LANCHOR0'
+   55: %cc:CCZ=cmp([%r1:DI],[%r1:DI+0x8])  */
+
+long x, y;
+
+long
+foo (void)
+{
+  if (x == y)
+    return 42;
+  return 0;
+}

Reply via email to