https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125880

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 22 Jun 2026, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125880
> 
> --- Comment #7 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> (In reply to Hongtao Liu from comment #6)
> > > For the cases above the code comes from the vec_init expander but I can
> > > imagine this might be too early for a perfect decision.
> > 
> > It comes from ix86_expand_vector_init_interleave, which uses SImode for
> > V*HI/V*QImode for vec_init_0.
> >
> 
> At the time of ix86_expand_vector_init, we don't know whether the source comes
> from memory or a GPR.
> - For memory, pinsrw/pinsrb is probably a win.
> - For register, pinsrw/pinsrb from r32 should be worse than vmovd for port
> pressure on Intel P-core, but OK for E-core. For Zen: pinsr* is 2u vs 1u
> (latency-equal-ish); Zen5 gives pinsr great TP (0.25) but vmovd is still fewer
> uops.

Yes, as said, RTL expansion is likely too early.  We'd want some kind of
peephole/splitter or an extension to STV?  Ideally saving the GPR
use before RA.
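
To make the memory vs. GPR distinction concrete, a minimal sketch of the
two source cases using GCC generic vector extensions.  This is a
hypothetical reduced example, not the testcase from the PR; the function
names are made up, and which expander path these actually take (and what
code results) would need checking against -O2 output.

```c
/* Hypothetical reduced example, not the testcase attached to the PR.  */
typedef short v8hi __attribute__ ((vector_size (16)));

/* All eight elements arrive as scalar arguments, so on x86-64 at least
   the first six come from GPRs.  */
v8hi
init_from_gpr (short a, short b, short c, short d,
	       short e, short f, short g, short h)
{
  return (v8hi) { a, b, c, d, e, f, g, h };
}

/* All eight elements are loaded from memory, with a stride so the
   loads are not one contiguous vector load.  */
v8hi
init_from_mem (const short *p)
{
  return (v8hi) { p[0], p[2], p[4], p[6], p[8], p[10], p[12], p[14] };
}
```

Both functions build the same kind of V8HI constructor; only the origin
of the elements differs, which is the information that is not available
at expand time.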
