On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> > >   Ok for trunk?
> >
> > Some time ago support for the CLOBBER_HIGH RTX was added (and later
> > removed for some reason). Perhaps we could resurrect the patch for the
> > purpose of ferrying 128bit modes via vzeroupper RTX?
> 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html

https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
is where it got removed, CCing Richard.

> > +(define_split
> > +  [(match_parallel 0 "vzeroupper_pattern"
> > +     [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
> > +  "TARGET_AVX && ix86_pre_reload_split ()"
> > +  [(match_dup 0)]
> > +{
> > +  /* When vzeroupper is explicitly used, for LRA purposes, make it clear
> > +     the instruction kills sse registers.  */
> > +  gcc_assert (cfun->machine->has_explicit_vzeroupper);
> > +  unsigned int nregs = TARGET_64BIT ? 16 : 8;
> > +  rtvec vec = rtvec_alloc (nregs + 1);
> > +  RTVEC_ELT (vec, 0) = gen_rtx_UNSPEC_VOLATILE (VOIDmode,
> > +                        gen_rtvec (1, const1_rtx),
> > +                        UNSPECV_VZEROUPPER);
> > +  for (unsigned int i = 0; i < nregs; ++i)
> > +    {
> > +      unsigned int regno = GET_SSE_REGNO (i);
> > +      rtx reg = gen_rtx_REG (V2DImode, regno);
> > +      RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
> > +    }
> > +  operands[0] = gen_rtx_PARALLEL (VOIDmode, vec);
> > +})
> >
> > Wouldn't this also kill lower 128bit values that are not touched by
> > vzeroupper? A CLOBBER_HIGH would be more appropriate here.

Yes, it would.  But normally the only xmm* hard regs live across an
explicit user vzeroupper would be local and global register variables;
I think the 1st scheduling pass etc. shouldn't extend the lifetime of
xmm hard regs across an UNSPEC_VOLATILE.
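
To make the distinction concrete, here is a minimal sketch in plain C.
It is not part of the patch and uses no GCC internals: model_reg,
hw_vzeroupper, view_plain_clobber and view_clobber_high are hypothetical
names. It assumes that register liveness is tracked per hard register,
so a plain CLOBBER is treated as killing the whole register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one 256-bit ymm register; not a GCC data
   structure.  */
typedef struct
{
  uint64_t lane[4];  /* lane[0..1]: low 128 bits (the xmm part),
                        lane[2..3]: upper 128 bits.  */
  bool low_live;     /* Compiler still trusts the low 128 bits.  */
  bool high_live;    /* Compiler still trusts the upper 128 bits.  */
} model_reg;

/* What the instruction itself does: zero the upper half and leave
   the low half untouched.  */
void
hw_vzeroupper (model_reg *r)
{
  r->lane[2] = 0;
  r->lane[3] = 0;
}

/* Assumed compiler view of (clobber (reg:V2DI xmmN)): the value in
   the register is considered dead, including the low 128 bits that
   the hardware actually preserves.  */
void
view_plain_clobber (model_reg *r)
{
  r->low_live = false;
  r->high_live = false;
}

/* Assumed compiler view of a CLOBBER_HIGH-style side effect: only the
   bits above the 128-bit mode are considered dead.  */
void
view_clobber_high (model_reg *r)
{
  r->high_live = false;
}
```

In this model the plain clobber is conservative, not wrong: it drops
knowledge of low-half values that the hardware keeps, which only
matters if an xmm hard reg is actually live across the vzeroupper.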

        Jakub
