Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

Richard Biener via Gcc-patches Wed, 27 May 2020 01:21:25 -0700

On Wed, May 27, 2020 at 6:36 AM Jiufu Guo <guoji...@linux.ibm.com> wrote:
>
> Segher Boessenkool <seg...@kernel.crashing.org> writes:
>
> > Hi!
> >
> > On Tue, May 26, 2020 at 08:58:13AM +0200, Richard Biener wrote:
> >> On Mon, May 25, 2020 at 7:44 PM Segher Boessenkool
> >> <seg...@kernel.crashing.org> wrote:
> >> > Yes, cunroll does not have its own option, and that is a problem.  But
> >> > that is easy to fix!  Either with an option, or just with params (the
> >> > option wouldn't do more than set params anyway?)
> >>
> >> Well, given coming up with different names for essentially the same
> >> transform is going to be challenging how about sth like
> >>
> >> -funroll-loops={early,late,static,dynamic}[insert better names here]
> >
> > User interface is hard :-)  I think luckily we don't need to change
> > anything there yet, just have an internal flag?
> >
> > But complete unrolling is something quite different, so it should have
> > its own flag anyway (at least internally).
> >
> >> note there's also -fpeel-loops which may match the transform
> >> done on GIMPLE better?
> >
> > Peeling and unrolling are the same thing, if doing complete unrolling
> > (or complete peeling), followed by DCE in both cases.  Peeling is a
> > nicer name here I think, yeah.
> >
> >> I'm not sure which are the technically
> >> correct terms for unrollings that elide the loop (the backedge).
> >
> > I don't know a better term than "complete", I don't remember ever seeing
> > something else either.
>
> How about "Var(flag_cunroll_grow_size) EnabledBy(funroll-loops ||
> funroll-all-loops || fpeel-loops)" Or flag_cunroll_allow_grow_size?
>
> And then using this flags as:
>   unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size
>                                                    || optimize >= 3, true);
>
> And we do not need to enable this flag at -O2.


Sure this works for me.  Note I'd make funroll-loops enabled by
funroll-all-loops so you could simplify the above.

Richard.

> Thanks for all your helpful comments again!
>
> Jiufu
>
> >
> >> We're doing such kind of unrolling even if we cannot statically
> >> decide which of a set of possible exits we take (and internally
> >> call that peeling, if we can statically decide we call it complete
> >> unrolling).
> >
> > "Peeling" is placing some copies of the loop before the loop;
> > "unrolling" is placing a few copies of the loop inside the loop body.
> > Does that match usage here?
> >
> >> The RTL side OTOH only performs classical unrolling,
> >> preserving the backedge with various strategies for the
> >> remaining iterations.
> >
> > And if you do complete unrolling that way, the backedge can be removed,
> > since it can be shown never to be taken.
> >
> >> As said, for the regression on the 10 branch with ppc I'd add
> >> [a hidden] flag that controls the RTL unroller, also set by
> >> -funroll-loops and triggered by the ppc specific heuristics.
> >
> > But the problem is in cunroll?  This is so backwards...  Because some
> > other transform abuses the unroller flags, adding a second level flag
> > with the same meaning :-(  It will work for fixing the regression,
> > sure, and it is slightly less code as well.
> >
> >
> > Segher

Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

Reply via email to