On Wed, 3 Jun 2020, Kewen.Lin wrote:

> Hi Richi,
> 
> on 2020/6/2 7:38 PM, Richard Biener wrote:
> > On Thu, 28 May 2020, Kewen.Lin wrote:
> > 
> >> Hi,
> >>
> >> This is a repost; you can refer to the original series
> >> via https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538360.html.
> >>
> >> As we discussed in the thread
> >> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
> >> (original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html),
> >> I'm working to teach IVOPTs to consider D-form group access during
> >> unrolling.
> >> The difference between D-form and other forms during unrolling is that we
> >> can put the stride into the displacement field to avoid additional step
> >> increments, e.g.:
> >>
> >> With X-form (uf step increments, where uf is the unroll factor):
> >>   ...
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   ...
> >>
> >> With D-form (one step increment for each base):
> >>   ...
> >>   LD A = baseA, OFF
> >>   LD B = baseB, OFF
> >>   ST C = baseC, OFF
> >>   LD A = baseA, OFF+stride
> >>   LD B = baseB, OFF+stride
> >>   ST C = baseC, OFF+stride
> >>   LD A = baseA, OFF+2*stride
> >>   LD B = baseB, OFF+2*stride
> >>   ST C = baseC, OFF+2*stride
> >>   ...
> >>   baseA += stride * uf
> >>   baseB += stride * uf
> >>   baseC += stride * uf
> >>
> >> Imagine the loop gets unrolled 8 times: that is 3 step updates with
> >> D-form vs. 8 step updates with X-form. Here we only need to check that the
> >> stride meets the D-form field requirement, since if OFF doesn't, we can
> >> construct baseA' as baseA + OFF.
> > 
> > I'd just mention there are other targets that have the choice between
> > the above forms.  Since IVOPTs itself does not perform the unrolling
> > the IL it produces is the same, correct?
> > 
> Yes.  Before this patch, IVOPTs doesn't consider the impact of unrolling;
> it only models things based on what it sees.  In effect it assumes that
> later RTL unrolling won't happen.
> 
> With this patch, since the IV choice may change, the IL may change too.
> The typical difference with this patch is:
> 
>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
> vs.
>   vect__1.7_15 = MEM[base: _29, offset: 0B];
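
To make the step-update arithmetic from the quoted example concrete (8
updates with X-form vs. 3 with D-form at an unroll factor of 8), here is
a minimal Python sketch.  The function names are purely illustrative and
are not GCC code; the sketch only restates the counting in the example.

```python
def displacement_offsets(off, stride, uf):
    """Displacements used by the uf unrolled copies of one D-form access.

    Copy k of the access uses OFF + k*stride, as in the quoted listing.
    """
    return [off + k * stride for k in range(uf)]


def step_updates(form, uf, num_bases):
    """Number of step increments per unrolled loop body.

    X-form: the shared index X is bumped once per unrolled copy (uf updates).
    D-form: each base is bumped once by stride*uf (num_bases updates).
    """
    if form == "X":
        return uf
    if form == "D":
        return num_bases
    raise ValueError(f"unknown addressing form: {form!r}")
```

With three bases (A, B, C) and uf = 8, step_updates("X", 8, 3) gives 8
and step_updates("D", 8, 3) gives 3, matching the numbers above.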

So we're asking IVOPTs "if we were unrolling this loop, would you make
a different IV choice?"  Thus I wonder why we need so much complexity
here.  That is, if we can classify the loop as possibly being unrolled,
we could evaluate IVOPTs' IV choice (and overall cost) on the original
loop, and in a second run on the original loop with fake IV uses
added with extra offsets.  If the overall IV cost is similar, we'll
take the unroll-friendly choice; if the costs are way different
(I wouldn't expect this to be the case ever?), I'd side with the
IV choice when not unrolling (and mark the loop as not to be unrolled).
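
A rough sketch of that decision, in Python rather than GCC code.  All
names here are hypothetical, and the relative tolerance used to decide
what counts as "similar" cost is my assumption; nothing above fixes a
threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IvDecision:
    use_unroll_friendly_ivs: bool  # take the IV set from the fake-use run
    allow_unroll: bool             # False means "mark loop as not to be unrolled"


def decide(cost_original, cost_with_fake_uses, tolerance=0.1):
    """Compare the two IVOPTs runs described above.

    cost_original:       overall IV cost on the original loop
    cost_with_fake_uses: overall IV cost with fake offset uses added
    tolerance:           assumed relative threshold for "similar" costs
    """
    if cost_original < 0 or cost_with_fake_uses < 0:
        raise ValueError("costs must be non-negative")
    biggest = max(cost_original, cost_with_fake_uses)
    similar = abs(cost_original - cost_with_fake_uses) <= tolerance * biggest
    if similar:
        # Costs are close: prefer the unroll-friendly IVs, leave unrolling to RTL.
        return IvDecision(use_unroll_friendly_ivs=True, allow_unroll=True)
    # Costs differ a lot: keep the non-unrolled choice and forbid unrolling.
    return IvDecision(use_unroll_friendly_ivs=False, allow_unroll=False)
```

For example, decide(10, 10.5) keeps unrolling possible with the
unroll-friendly IVs, while decide(10, 30) sticks with the original
choice and marks the loop as not to be unrolled.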

Thus I'd err on the side of not unrolling but leave the ultimate choice
of whether to unroll to RTL unless IV cost makes that prohibitive.

Even without X- or D-form addressing modes the IV choice may differ,
and I think we don't need extra knobs for the unroller but instead
can decide to set the existing n_unroll to zero (force no unrolling)
when costs say it would be bad?

Richard.

> BR,
> Kewen
> 
> > Richard.
> > 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
