Hi!

On Tue, Nov 30, 2021 at 01:05:48PM +0800, Kewen.Lin wrote:
> on 2021/11/30 6:06 AM, Segher Boessenkool wrote:
> > On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote:
> >>     unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
> >>     unsigned extra_cost = nunits * adjusted_cost;
> > 
> >> For V2DI/V2DF, it uses a penalized cost of 2 for each scalar load,
> >> while for the other modes it uses 1.
> > 
> > So for V2D[IF] we get 4, for V4S[IF] we get 4, for V8HI it's 8, and
> > for V16QI it is 16?  Pretty terrible as well, heh (I would expect all
> > vector ops to be similar cost).
> 
> But different numbers of vector units need different numbers of loads, so
> it seems reasonable to charge more when there are more loads to be fed
> into the limited number of load/store units.

More expensive, yes.  This expensive?  That doesn't look optimal :-)
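
For reference, the arithmetic behind the numbers above can be sketched as
follows.  This is only a minimal illustration built from the two quoted
lines: the function names extra_cost_for_nunits and check_quoted_costs are
hypothetical and are not taken from the patch, and the sketch ignores the
heuristic thresholds that decide whether the penalty applies at all.

```c
#include <assert.h>

/* Hypothetical helper (the name is not from the patch): wraps the two
   quoted lines so the per-mode numbers can be checked.  */
static unsigned
extra_cost_for_nunits (unsigned nunits)
{
  /* Two-element vectors (V2DI/V2DF) get a penalty of 2 per scalar
     load; every other element count gets 1.  */
  unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
  return nunits * adjusted_cost;
}

/* Hypothetical self-check of the values quoted in this thread.  */
static void
check_quoted_costs (void)
{
  assert (extra_cost_for_nunits (2) == 4);    /* V2DI, V2DF */
  assert (extra_cost_for_nunits (4) == 4);    /* V4SI, V4SF */
  assert (extra_cost_for_nunits (8) == 8);    /* V8HI */
  assert (extra_cost_for_nunits (16) == 16);  /* V16QI */
}
```

So the extra cost is flat from two to four elements and then grows
linearly with the element count.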

> > This also suggests we should cost vector construction separately, which
> > would pretty obviously be a good thing anyway (it happens often, it has
> > a quite different cost structure).
> 
> The vectorizer does model vector construction separately: enum
> vect_cost_for_stmt has a *vec_construct* entry, and normally it works
> well.  But for this bwaves hotspot our evaluation showed that it needs
> some more penalization, so we add the penalized cost to this special
> vector construction when some heuristic thresholds are met.

Ah, heuristics.  We can adjust them forever :-)


Segher
