On Mon, Feb 9, 2026 at 10:13 AM Alice Carlotti <[email protected]> wrote:
>
> On Fri, Jan 16, 2026 at 01:25:23AM -0800, Andrew Pinski wrote:
> > I think this is the hard approach to do this.
>
> Agreed now.
>
> > The easy approach is to return false from aarch64_use_return_insn_p if
> > you want to disable shrink wrapping for a function.
> > So just in aarch64_use_return_insn_p, add:
> > if (aarch64_cfun_enables_pstate_sm ())
> >   return false;
>
> This isn't quite the right place to check it, since shrink wrapping checks
> targetm.have_simple_return, whereas the above function is called by
> targetm.have_return.
>
> Instead, I have recreated the aarch64_use_simple_return_insn function that
> existed previously until r10-3507-gce9d2a37f2db20, and put the check there.
>
> However, I'm now confused about the comment for the aarch64_use_return_insn
> function - I'm not convinced this function is doing what it claims.
>
> >
> > If that does not work, then let's add a target hook that is used here.
> > Because there might be other targets that want to disable shrink
> > wrapping based on the attributes and disabling it via the flag seems
> > like a hack.
>
> This did feel like a bit of a hack, but I thought it might be in keeping with
> some other code that seems to update optimisation information for SME
> functions.  I also hadn't realised that the targetm.have_simple_return check
> could already be used to inhibit shrink wrapping (and had been in the past).
>
> >
> > Thanks,
> > Andrew Pinski
>
>
> I've also updated the test case.  For some reason the original test case 
> wasn't
> passing or failing as I intended, so I've changed the callee to use streaming
> mode, and simplified the checks for bar.  The checks in baz are probably
> redundant now that I'm no longer messing around with optimization flags, so I
> could remove that if preferred.
>
> Is this version OK for master and backport to affected branches?

Ok for all affected open branches (which I think is GCC 13, 14, and 15).

Thanks,
Andrew

>
> Thanks,
> Alice
>
> ---
>
> The meaning of poly_int values changes depending on whether we are in
> streaming or non-streaming mode, but this dependency is not explicitly
> tracked.  Locally-streaming functions can change streaming state in the
> prologue and epilogue, so it is unsafe to apply shrink wrapping to these
> functions, as doing so could change the mode seen by instructions like
> cntd.
>
> gcc/ChangeLog:
>
>         PR target/123624
>         * config/aarch64/aarch64-protos.h
>         (aarch64_use_simple_return_insn_p): New.
>         * config/aarch64/aarch64.cc
>         (aarch64_use_simple_return_insn_p): New, used...
>         * config/aarch64/aarch64.md (simple_return): ...here.
>
> gcc/testsuite/ChangeLog:
>
>         PR target/123624
>         * gcc.target/aarch64/sme/sme-shrinkwrap.c: New test.
>
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> 48d3a3de2356d385fc9612b5a0c546623bbc60fe..d1f2873f208bba675aab82dcf2831da3350785c6
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -984,6 +984,7 @@ bool aarch64_uimm12_shift (unsigned HOST_WIDE_INT);
>  int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &);
>  bool aarch64_is_mov_xn_imm (unsigned HOST_WIDE_INT);
>  bool aarch64_use_return_insn_p (void);
> +bool aarch64_use_simple_return_insn_p (void);
>  const char *aarch64_output_casesi (rtx *);
>  const char *aarch64_output_load_tp (rtx);
>  const char *aarch64_output_sme_zero_za (rtx);
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 293afa52b3b38781b765ca939ed51c280313bab4..dfe1d8ab309b3483318ac9e634510288d2dd6475
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -10765,6 +10765,20 @@ aarch64_use_return_insn_p (void)
>    return known_eq (cfun->machine->frame.frame_size, 0);
>  }
>
> +/* Return false for locally streaming functions in order to avoid
> +   shrink-wrapping them.  Shrink-wrapping is unsafe when the function 
> prologue
> +   and epilogue contain streaming state change, because these implicitly 
> change
> +   the meaning of poly_int values.  */
> +
> +bool
> +aarch64_use_simple_return_insn_p (void)
> +{
> +  if (aarch64_cfun_enables_pstate_sm ())
> +    return false;
> +
> +  return true;
> +}
> +
>  /* Generate the epilogue instructions for returning from a function.
>     This is almost exactly the reverse of the prolog sequence, except
>     that we need to insert barriers to avoid scheduling loads that read
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> 71458bf78f5cc4d926d7c5e0467daec9a5d75a03..b7f2f42be6e30c02b74703474ce640d795bb142c
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1414,7 +1414,7 @@
>
>  (define_insn "simple_return"
>    [(simple_return)]
> -  ""
> +  "aarch64_use_simple_return_insn_p ()"
>    {
>      output_asm_insn ("ret", operands);
>      return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c 
> b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..b8dfd5f92f62f048ce14b9e2196d325e33cf5c20
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sme/sme-shrinkwrap.c
> @@ -0,0 +1,78 @@
> +/* { dg-options "-O3 -fshrink-wrap" } */
> +/* { dg-do run { target { aarch64_sme_hw && aarch64_sve_hw } } } */
> +/* { dg-do compile { target { ! { aarch64_sme_hw && aarch64_sve_hw } } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +#include <arm_sme.h>
> +
> +#pragma GCC target "+sve"
> +
> +[[gnu::noipa]]
> +__arm_streaming
> +int callee (int x)
> +{
> +  return 0;
> +}
> +
> +/*
> +** foo:
> +**     cbnz    w0, [^\n]*
> +**     cntd    x0
> +**     ret
> +**     ...
> +*/
> +__arm_streaming
> +int foo(int x)
> +{
> +    if (x)
> +        return callee(3);
> +    return svcntd();
> +}
> +
> +/*
> +** bar:
> +**     ...
> +**     smstart [^\n]*
> +**     ...
> +** (
> +**     cntd    [^\n]*
> +**     ...
> +**     cbn?z   [^\n]*
> +** |
> +**     cbn?z   [^\n]*
> +**     ...
> +**     cntd    [^\n]*
> +** )
> +**     ...
> +*/
> +
> +__arm_locally_streaming
> +int bar(int x)
> +{
> +    if (x)
> +        return callee(3);
> +    return svcntd();
> +}
> +
> +/*
> +** baz:
> +**     cbnz    w0, [^\n]*
> +**     cntd    x0
> +**     ret
> +**     ...
> +*/
> +__arm_streaming
> +int baz(int x)
> +{
> +    if (x)
> +        return callee(3);
> +    return svcntd();
> +}
> +
> +[[gnu::noipa]]
> +int main()
> +{
> +  if (bar(0) != svcntsd())
> +    __builtin_abort();
> +  return 0;
> +}

Reply via email to