On Fri, Jan 16, 2026 at 01:25:23AM -0800, Andrew Pinski wrote:
> I think this is the hard approach to do this.
Agreed now.
> The easy approach is to return false from aarch64_use_return_insn_p if
> you want to disable shrink wrapping for a function.
> So just in aarch64_use_return_insn_p, add:
> if (aarch64_cfun_enables_pstate_sm ())
> return false;
This isn't quite the right place to check it, since shrink wrapping checks
targetm.have_simple_return, whereas the above function is called by
targetm.have_return.
Instead, I have recreated the aarch64_use_simple_return_insn function that
existed previously until r10-3507-gce9d2a37f2db20, and put the check there.
However, I'm now confused about the comment for the aarch64_use_return_insn
function - I'm not convinced this function is doing what it claims.
>
> If that does not work, then let's add a target hook that is used here.
> Because there might be other targets that want to disable shrink
> wrapping based on the attributes and disabling it via the flag seems
> like a hack.
This did feel like a bit of a hack, but I thought it might be in keeping with
some other code that seems to update optimisation information for SME
functions. I also hadn't realised that the targetm.have_simple_return check
could already be used to inhibit shrink wrapping (and had been in the past).
>
> Thanks,
> Andrew Pinski
I've also updated the test case. For some reason the original test case wasn't
passing or failing as I intended, so I've changed the callee to use streaming
mode, and simplified the checks for bar. The checks in baz are probably
redundant now that I'm no longer messing around with optimization flags, so I
could remove that if preferred.
Is this version OK for master and backport to affected branches?
Thanks,
Alice
---
The meaning of poly_int values changes depending on whether we are in
streaming or non-streaming mode, but this dependency is not explicitly
tracked. Locally-streaming functions can change streaming state in the
prologue and epilogue, so it is unsafe to apply shrink wrapping to these
functions, as doing so could change the mode seen by instructions like
cntd.
gcc/ChangeLog:
PR target/123624
* config/aarch64/aarch64-protos.h
(aarch64_use_simple_return_insn_p): New.
* config/aarch64/aarch64.cc
(aarch64_use_simple_return_insn_p): New, used...
* config/aarch64/aarch64.md (simple_return): ...here.
gcc/testsuite/ChangeLog:
PR target/123624
* gcc.target/aarch64/sme/sme-shrinkwrap.c: New test.
diff --git a/gcc/config/aarch64/aarch64-protos.h
b/gcc/config/aarch64/aarch64-protos.h
index
48d3a3de2356d385fc9612b5a0c546623bbc60fe..d1f2873f208bba675aab82dcf2831da3350785c6
100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -984,6 +984,7 @@ bool aarch64_uimm12_shift (unsigned HOST_WIDE_INT);
int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &);
bool aarch64_is_mov_xn_imm (unsigned HOST_WIDE_INT);
bool aarch64_use_return_insn_p (void);
+bool aarch64_use_simple_return_insn_p (void);
const char *aarch64_output_casesi (rtx *);
const char *aarch64_output_load_tp (rtx);
const char *aarch64_output_sme_zero_za (rtx);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index
293afa52b3b38781b765ca939ed51c280313bab4..dfe1d8ab309b3483318ac9e634510288d2dd6475
100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -10765,6 +10765,20 @@ aarch64_use_return_insn_p (void)
return known_eq (cfun->machine->frame.frame_size, 0);
}
+/* Return false for locally streaming functions in order to avoid
+ shrink-wrapping them. Shrink-wrapping is unsafe when the function prologue
+ and epilogue contain streaming state change, because these implicitly change
+ the meaning of poly_int values. */
+
+bool
+aarch64_use_simple_return_insn_p (void)
+{
+ if (aarch64_cfun_enables_pstate_sm ())
+ return false;
+
+ return true;
+}
+
/* Generate the epilogue instructions for returning from a function.
This is almost exactly the reverse of the prolog sequence, except
that we need to insert barriers to avoid scheduling loads that read
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index
71458bf78f5cc4d926d7c5e0467daec9a5d75a03..b7f2f42be6e30c02b74703474ce640d795bb142c
100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1414,7 +1414,7 @@
(define_insn "simple_return"
[(simple_return)]
- ""
+ "aarch64_use_simple_return_insn_p ()"
{
output_asm_insn ("ret", operands);
return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
new file mode 100644
index
0000000000000000000000000000000000000000..b8dfd5f92f62f048ce14b9e2196d325e33cf5c20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
@@ -0,0 +1,78 @@
+/* { dg-options "-O3 -fshrink-wrap" } */
+/* { dg-do run { target { aarch64_sme_hw && aarch64_sve_hw } } } */
+/* { dg-do compile { target { ! { aarch64_sme_hw && aarch64_sve_hw } } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sme.h>
+
+#pragma GCC target "+sve"
+
+[[gnu::noipa]]
+__arm_streaming
+int callee (int x)
+{
+ return 0;
+}
+
+/*
+** foo:
+** cbnz w0, [^\n]*
+** cntd x0
+** ret
+** ...
+*/
+__arm_streaming
+int foo(int x)
+{
+ if (x)
+ return callee(3);
+ return svcntd();
+}
+
+/*
+** bar:
+** ...
+** smstart [^\n]*
+** ...
+** (
+** cntd [^\n]*
+** ...
+** cbn?z [^\n]*
+** |
+** cbn?z [^\n]*
+** ...
+** cntd [^\n]*
+** )
+** ...
+*/
+
+__arm_locally_streaming
+int bar(int x)
+{
+ if (x)
+ return callee(3);
+ return svcntd();
+}
+
+/*
+** baz:
+** cbnz w0, [^\n]*
+** cntd x0
+** ret
+** ...
+*/
+__arm_streaming
+int baz(int x)
+{
+ if (x)
+ return callee(3);
+ return svcntd();
+}
+
+[[gnu::noipa]]
+int main()
+{
+ if (bar(0) != svcntsd())
+ __builtin_abort();
+ return 0;
+}