https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621

--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 20 Sep 2016, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621
> 
> --- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
> (In reply to Richard Biener from comment #12)
> 
> > If V2DFmode moves are fine(?) then maybe not do this for the load/store
> > kinds - this means only handling vector_stmt this way (and maybe
> > vect_promote_demote?) - at least make sure to not handle scalar_*
> > (not sure if vectype is always NULL for those -- docs say only
> > memory ops may depend on vectype).
> 
> Moves are fine; V2DFmode vector arithmetic insns (addpd, subpd, mulpd) have
> much higher latencies (e.g. 6 for addpd, 9 for mulpd) compared to their
> {SF,DF}mode (or V4SFmode) versions (1 for addps, 2 for mulps).
> 
> > Instead of += 20 I'd have done *= <factor> to
> > make it more independent of the absolute value of the cost numbers.
> 
> IMO, having no other data at hand than Agner Fog's instruction tables, it
> looks that penalizing vector_stmt cost with a factor of 5 should be OK for a
> start.
> 
> > If you'd do the cost adjustment in ix86_add_stmt_cost you have more control
> > over the details (there's also similar offsetting for silvermont)
> 
> ix86_builtin_vectorization_cost is also called from there. OTOH,
> ix86_add_stmt_cost uses some other arguments (e.g. location), which I think
> are irrelevant to the insn type cost adjustment.

At least you won't get called for the scalar loop copy and you have
definite access to vectype.
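
To make the shape of the adjustment concrete, here is a minimal standalone
sketch. It is not the actual i386.c code: the enums and adjust_stmt_cost are
made-up stand-ins for the vectorizer's cost kinds, the vectype's mode and the
place in ix86_add_stmt_cost where the scaling could happen. It assumes the
penalty is a multiplicative factor applied only to vector_stmt with a V2DF
vectype, leaving loads, stores and scalar_* kinds alone.

```c
#include <assert.h>

/* Hypothetical stand-ins for the vectorizer's statement cost kinds.
   These are NOT GCC's real enumerators.  */
enum stmt_kind
{
  SCALAR_STMT, SCALAR_LOAD, SCALAR_STORE,
  VECTOR_STMT, VECTOR_LOAD, VECTOR_STORE
};

/* Hypothetical stand-in for the mode of the vectype; MODE_NONE models
   a statement that has no vectype at all.  */
enum vec_mode { MODE_NONE, MODE_V4SF, MODE_V2DF };

/* Scale BASE_COST by FACTOR only for V2DF vector arithmetic.
   Moves (loads/stores) and scalar statements keep their base cost.
   Using *= instead of += keeps the penalty independent of the
   absolute value of the cost numbers.  */
int
adjust_stmt_cost (enum stmt_kind kind, enum vec_mode mode,
                  int base_cost, int factor)
{
  assert (factor > 0);

  if (kind == VECTOR_STMT && mode == MODE_V2DF)
    return base_cost * factor;

  return base_cost;
}
```

With a factor of 5, a V2DF vector_stmt of base cost 1 would be costed at 5,
while a V2DF vector load or any scalar statement stays at its base cost.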
