On Mon, Feb 09, 2026 at 10:21:15AM -0800, Andrew Pinski wrote:
> On Mon, Feb 9, 2026 at 10:13 AM Alice Carlotti <[email protected]> wrote:
> >
> > On Fri, Jan 16, 2026 at 01:25:23AM -0800, Andrew Pinski wrote:
> > > I think this is the hard approach to do this.
> >
> > Agreed now.
> >
> > > The easy approach is to return false from aarch64_use_return_insn_p if
> > > you want to disable shrink wrapping for a function.
> > > So just in aarch64_use_return_insn_p, add:
> > > if (aarch64_cfun_enables_pstate_sm ())
> > >   return false;
> >
> > This isn't quite the right place to check it, since shrink wrapping checks
> > targetm.have_simple_return, whereas the above function is called by
> > targetm.have_return.
> >
> > Instead, I have recreated the aarch64_use_simple_return_insn function that
> > existed previously until r10-3507-gce9d2a37f2db20, and put the check there.
> >
> > However, I'm now confused about the comment for the aarch64_use_return_insn
> > function - I'm not convinced this function is doing what it claims.
> >
> > >
> > > If that does not work, then let's add a target hook that is used here.
> > > Because there might be other targets that want to disable shrink
> > > wrapping based on the attributes and disabling it via the flag seems
> > > like a hack.
> >
> > This did feel like a bit of a hack, but I thought it might be in keeping 
> > with
> > some other code that seems to update optimisation information for SME
> > functions.  I also hadn't realised that the targetm.have_simple_return check
> > could already be used to inhibit shrink wrapping (and had been in the past).
> >
> > >
> > > Thanks,
> > > Andrew Pinski
> >
> >
> > I've also updated the test case.  For some reason the original test case 
> > wasn't
> > passing or failing as I intended, so I've changed the callee to use 
> > streaming
> > mode, and simplified the checks for bar.  The checks in baz are probably
> > redundant now that I'm no longer messing around with optimization flags, so 
> > I
> > could remove that if preferred.
> >
> > Is this version OK for master and backport to affected branches?
> 
> Ok for all affected open branches (which I think is GCC 13, 14, and 15).

Thanks.  Unless I miscounted, SME only arrived in GCC 14, so 13 is unaffected.

(I've pushed to master now, and will push to GCC 14+15 once I've finished
testing those.)

Alice

> 
> Thanks,
> Andrew
> 
> >
> > Thanks,
> > Alice
> >
> > ---
> >
> > The meaning of poly_int values changes depending on whether we are in
> > streaming or non-streaming mode, but this dependency is not explicitly
> > tracked.  Locally-streaming functions can change streaming state in the
> > prologue and epilogue, so it is unsafe to apply shrink wrapping to these
> > functions, as doing so could change the mode seen by instructions like
> > cntd.
> >
> > gcc/ChangeLog:
> >
> >         PR target/123624
> >         * config/aarch64/aarch64-protos.h
> >         (aarch64_use_simple_return_insn_p): New.
> >         * config/aarch64/aarch64.cc
> >         (aarch64_use_simple_return_insn_p): New, used...
> >         * config/aarch64/aarch64.md (simple_return): ...here.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/123624
> >         * gcc.target/aarch64/sme/sme-shrinkwrap.c: New test.
> >
> >
> > diff --git a/gcc/config/aarch64/aarch64-protos.h 
> > b/gcc/config/aarch64/aarch64-protos.h
> > index 
> > 48d3a3de2356d385fc9612b5a0c546623bbc60fe..d1f2873f208bba675aab82dcf2831da3350785c6
> >  100644
> > --- a/gcc/config/aarch64/aarch64-protos.h
> > +++ b/gcc/config/aarch64/aarch64-protos.h
> > @@ -984,6 +984,7 @@ bool aarch64_uimm12_shift (unsigned HOST_WIDE_INT);
> >  int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &);
> >  bool aarch64_is_mov_xn_imm (unsigned HOST_WIDE_INT);
> >  bool aarch64_use_return_insn_p (void);
> > +bool aarch64_use_simple_return_insn_p (void);
> >  const char *aarch64_output_casesi (rtx *);
> >  const char *aarch64_output_load_tp (rtx);
> >  const char *aarch64_output_sme_zero_za (rtx);
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 
> > 293afa52b3b38781b765ca939ed51c280313bab4..dfe1d8ab309b3483318ac9e634510288d2dd6475
> >  100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -10765,6 +10765,20 @@ aarch64_use_return_insn_p (void)
> >    return known_eq (cfun->machine->frame.frame_size, 0);
> >  }
> >
> > +/* Return false for locally streaming functions in order to avoid
> > +   shrink-wrapping them.  Shrink-wrapping is unsafe when the function 
> > prologue
> > +   and epilogue contain streaming state change, because these implicitly 
> > change
> > +   the meaning of poly_int values.  */
> > +
> > +bool
> > +aarch64_use_simple_return_insn_p (void)
> > +{
> > +  if (aarch64_cfun_enables_pstate_sm ())
> > +    return false;
> > +
> > +  return true;
> > +}
> > +
> >  /* Generate the epilogue instructions for returning from a function.
> >     This is almost exactly the reverse of the prolog sequence, except
> >     that we need to insert barriers to avoid scheduling loads that read
> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > index 
> > 71458bf78f5cc4d926d7c5e0467daec9a5d75a03..b7f2f42be6e30c02b74703474ce640d795bb142c
> >  100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -1414,7 +1414,7 @@
> >
> >  (define_insn "simple_return"
> >    [(simple_return)]
> > -  ""
> > +  "aarch64_use_simple_return_insn_p ()"
> >    {
> >      output_asm_insn ("ret", operands);
> >      return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c 
> > b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
> > new file mode 100644
> > index 
> > 0000000000000000000000000000000000000000..b8dfd5f92f62f048ce14b9e2196d325e33cf5c20
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
> > @@ -0,0 +1,78 @@
> > +/* { dg-options "-O3 -fshrink-wrap" } */
> > +/* { dg-do run { target { aarch64_sme_hw && aarch64_sve_hw } } } */
> > +/* { dg-do compile { target { ! { aarch64_sme_hw && aarch64_sve_hw } } } } 
> > */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +#include <arm_sme.h>
> > +
> > +#pragma GCC target "+sve"
> > +
> > +[[gnu::noipa]]
> > +__arm_streaming
> > +int callee (int x)
> > +{
> > +  return 0;
> > +}
> > +
> > +/*
> > +** foo:
> > +**     cbnz    w0, [^\n]*
> > +**     cntd    x0
> > +**     ret
> > +**     ...
> > +*/
> > +__arm_streaming
> > +int foo(int x)
> > +{
> > +    if (x)
> > +        return callee(3);
> > +    return svcntd();
> > +}
> > +
> > +/*
> > +** bar:
> > +**     ...
> > +**     smstart [^\n]*
> > +**     ...
> > +** (
> > +**     cntd    [^\n]*
> > +**     ...
> > +**     cbn?z   [^\n]*
> > +** |
> > +**     cbn?z   [^\n]*
> > +**     ...
> > +**     cntd    [^\n]*
> > +** )
> > +**     ...
> > +*/
> > +
> > +__arm_locally_streaming
> > +int bar(int x)
> > +{
> > +    if (x)
> > +        return callee(3);
> > +    return svcntd();
> > +}
> > +
> > +/*
> > +** baz:
> > +**     cbnz    w0, [^\n]*
> > +**     cntd    x0
> > +**     ret
> > +**     ...
> > +*/
> > +__arm_streaming
> > +int baz(int x)
> > +{
> > +    if (x)
> > +        return callee(3);
> > +    return svcntd();
> > +}
> > +
> > +[[gnu::noipa]]
> > +int main()
> > +{
> > +  if (bar(0) != svcntsd())
> > +    __builtin_abort();
> > +  return 0;
> > +}

Reply via email to