Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > On Fri, May 14, 2021 at 10:27 AM Hongtao Liu <crazy...@gmail.com> wrote: >> >> On Thu, May 13, 2021 at 7:52 PM Richard Sandiford >> <richard.sandif...@arm.com> wrote: >> > >> > Jakub Jelinek <ja...@redhat.com> writes: >> > > On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote: >> > >> Jakub Jelinek <ja...@redhat.com> writes: >> > >> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote: >> > >> >> > > Bootstrapped and regtested on X86_64-linux-gnu{-m32,} >> > >> >> > > Ok for trunk? >> > >> >> > >> > >> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later >> > >> >> > removed for some reason). Perhaps we could resurrect the patch for >> > >> >> > the >> > >> >> > purpose of ferrying 128bit modes via vzeroupper RTX? >> > >> >> >> > >> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html >> > >> > >> > >> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html >> > >> > is where it got removed, CCing Richard. >> > >> >> > >> Yeah. Initially clobber_high seemed like the best appraoch for >> > >> handling the tlsdesc thing, but in practice it was too difficult >> > >> to shoe-horn the concept in after the fact, when so much rtl >> > >> infrastructure wasn't prepared to deal with it. The old support >> > >> didn't handle all cases and passes correctly, and handled others >> > >> suboptimally. >> > >> >> > >> I think it would be worth using the same approach as >> > >> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for >> > >> vzeroupper: represent the instructions as call_insns in which the >> > >> call has a special vzeroupper ABI. I think that's likely to lead >> > >> to better code than clobber_high would (or at least, it did for >> > >> tlsdesc). >> >> From an implementation perspective, I guess you're meaning we should >> implement TARGET_INSN_CALLEE_ABI and TARGET_FNTYPE_ABI in the i386 >> backend. >> > When I implemented the vzeroupper pattern as call_insn and defined > TARGET_INSN_CALLEE_ABI for it, I got several failures. they're related > to 2 parts > > 1. requires_stack_frame_p return true for vzeroupper which should be false. > 2. in subst_stack_regs, vzeroupper shouldn't kill arguments > > I've tried a rough patch like below, it works for those failures, > unfortunately, I don't have an arm machine to test, so I want to ask > would the below change break something in the arm backend?
ABI id 0 just means the default ABI. Real calls can use other ABIs besides the default. That said… > modified gcc/reg-stack.c > @@ -174,6 +174,7 @@ > #include "reload.h" > #include "tree-pass.h" > #include "rtl-iter.h" > +#include "function-abi.h" > > #ifdef STACK_REGS > > @@ -2385,7 +2386,7 @@ subst_stack_regs (rtx_insn *insn, stack_ptr regstack) > bool control_flow_insn_deleted = false; > int i; > > - if (CALL_P (insn)) > + if (CALL_P (insn) && insn_callee_abi (insn).id () == 0) > { > int top = regstack->top; …reg-stack.c is effectively x86-specific code, so checking id 0 here wouldn't affect anything else. It doesn't feel very future-proof though, since x86 could use ABIs other than 0 for real calls in future. AIUI the property that matters here isn't the ABI, but that the target of the call doesn't reference stack registers. That can be true for real calls too, with -fipa-ra. > modified gcc/shrink-wrap.c > @@ -58,7 +58,12 @@ requires_stack_frame_p (rtx_insn *insn, > HARD_REG_SET prologue_used, > unsigned regno; > > if (CALL_P (insn)) > - return !SIBLING_CALL_P (insn); > + { > + if (insn_callee_abi (insn).id() != 0) > + return false; > + else > + return !SIBLING_CALL_P (insn); > + } TBH I'm not sure why off-hand this function needs to treat non-sibling calls specially, rather than rely on normal DF information. Calls have a use of the stack pointer, so we should return true for that reason: /* The stack ptr is used (honorarily) by a CALL insn. */ df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i], NULL, bb, insn_info, DF_REF_REG_USE, DF_REF_CALL_STACK_USAGE | flags); I guess this is something we should suppress for fake calls though. It looks like the rtx “used” flag is unused for INSNs, so we could use that as a CALL_INSN flag that indicates a fake call. We could just need to make: /* For all other RTXes clear the used flag on the copy. */ RTX_FLAG (copy, used) = 0; conditional on !INSN_P. Thanks, Richard