https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154

--- Comment #29 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #27)
> (In reply to rguent...@suse.de from comment #25)
> > and STV converting single-instruction 'chains':
> > 
> > Collected chain #40... 
> >   insns: 381
> >   defs to convert: r463, r465
> > Computing gain for chain #40...
> >   Instruction gain 8 for   381: {r465:SI=smin(r463:SI,[`numBins']);clobber 
> > flags:CC;}
> >       REG_DEAD r463:SI
> >       REG_UNUSED flags:CC
> >   Instruction conversion gain: 8 
> >   Registers conversion cost: 4
> >   Total gain: 4
> > Converting chain #40...
> 
> Is this in the STV1 pass? This (pre-combine) pass should be enabled only for
> TImode conversion, a semi-hack where 64-bit targets convert memory access to
> TImode. General STV should not be run before combine.

Yes, this is STV1.  My patch to enable SImode and DImode chains did not change
where the pass runs, nor did it enable the second run, because of compile-time
concerns.

Indeed, changing this fixes the issue.  I'm going to benchmark it on
300.twolf.
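
For reference, the arithmetic in the chain #40 dump quoted above can be
sketched as follows.  This is only an illustration, not the code in the
i386 backend: the struct and function names are hypothetical, and the
"convert when the total gain is positive" rule is an assumption that is
merely consistent with the dump (total gain 4, chain converted).

```c
#include <assert.h>

/* Hypothetical record holding the two numbers printed per chain. */
struct chain_gain {
    int insn_gain;      /* "Instruction conversion gain" in the dump */
    int reg_conv_cost;  /* "Registers conversion cost" in the dump */
};

/* "Total gain" as the dump reports it: instruction gain minus the cost
   of moving values between general and vector registers. */
int total_gain(struct chain_gain g)
{
    return g.insn_gain - g.reg_conv_cost;
}

/* Assumed decision rule: convert the chain only if it is a net win. */
int should_convert(struct chain_gain g)
{
    return total_gain(g) > 0;
}

/* Replays chain #40: gain 8, cost 4, so total gain 4 and it converts. */
void check_chain40(void)
{
    struct chain_gain chain40 = { 8, 4 };
    assert(total_gain(chain40) == 4);
    assert(should_convert(chain40));
}
```

With these numbers even a single-instruction chain passes the check, which
is why such chains get converted once SImode chains are enabled in STV1.
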

> > to me the "spill" to (%rsp) looks suspicious and even more so
> > the vector(!) memory use in vpminsd.  RA could have used
> > 
> >   movd  %eax, %xmm1
> >   vpminsd %xmm1, %xmm0, %xmm1
> > 
> > no?  IRA allocates the pseudo to memory.  Testcase:
> 
> This is how IRA handles subregs. Please note that the memory is correctly
> aligned, so the vector load does not trip an alignment trap. However, on x86
> this approach triggers a store-forwarding stall.
