On Wed, May 27, 2020 at 6:36 AM Jiufu Guo <guoji...@linux.ibm.com> wrote:
>
> Segher Boessenkool <seg...@kernel.crashing.org> writes:
>
> > Hi!
> >
> > On Tue, May 26, 2020 at 08:58:13AM +0200, Richard Biener wrote:
> >> On Mon, May 25, 2020 at 7:44 PM Segher Boessenkool
> >> <seg...@kernel.crashing.org> wrote:
> >> > Yes, cunroll does not have its own option, and that is a problem. But
> >> > that is easy to fix! Either with an option, or just with params (the
> >> > option wouldn't do more than set params anyway?)
> >>
> >> Well, given coming up with different names for essentially the same
> >> transform is going to be challenging how about sth like
> >>
> >> -funroll-loops={early,late,static,dynamic}[insert better names here]
> >
> > User interface is hard :-) I think luckily we don't need to change
> > anything there yet, just have an internal flag?
> >
> > But complete unrolling is something quite different, so it should have
> > its own flag anyway (at least internally).
> >
> >> note there's also -fpeel-loops which may match the transform
> >> done on GIMPLE better?
> >
> > Peeling and unrolling are the same thing, if doing complete unrolling
> > (or complete peeling), followed by DCE in both cases. Peeling is a
> > nicer name here I think, yeah.
> >
> >> I'm not sure which are the technically
> >> correct terms for unrollings that elide the loop (the backedge).
> >
> > I don't know a better term than "complete", I don't remember ever seeing
> > something else either.
>
> How about "Var(flag_cunroll_grow_size) EnabledBy(funroll-loops ||
> funroll-all-loops || fpeel-loops)" Or flag_cunroll_allow_grow_size?
>
> And then using this flags as:
>     unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size
>                                                      || optimize >= 3, true);
>
> And we do not need to enable this flag at -O2.
Sure this works for me. Note I'd make funroll-loops enabled by
funroll-all-loops so you could simplify the above.

Richard.

> Thanks for all your helpful comments again!
>
> Jiufu
>
> >
> >> We're doing such kind of unrolling even if we cannot statically
> >> decide which of a set of possible exits we take (and internally
> >> call that peeling, if we can statically decide we call it complete
> >> unrolling).
> >
> > "Peeling" is placing some copies of the loop before the loop;
> > "unrolling" is placing a few copies of the loop inside the loop body.
> > Does that match usage here?
> >
> >> The RTL side OTOH only performs classical unrolling,
> >> preserving the backedge with various strategies for the
> >> remaining iterations.
> >
> > And if you do complete unrolling that way, the backedge can be removed,
> > since it can be shown never to be taken.
> >
> >> As said, for the regression on the 10 branch with ppc I'd add
> >> [a hidden] flag that controls the RTL unroller, also set by
> >> -funroll-loops and triggered by the ppc specific heuristics.
> >
> > But the problem is in cunroll? This is so backwards... Because some
> > other transform abuses the unroller flags, adding a second level flag
> > with the same meaning :-( It will work for fixing the regression,
> > sure, and it is slightly less code as well.
> >
> >
> > Segher
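
For readers following the terminology, the peeling / unrolling / complete
unrolling distinction quoted above can be written out by hand in C. This is
only a source-level sketch of what each transform looks like, not what
cunroll or the RTL unroller actually emits. The function names and the fixed
trip count N are made up for illustration and do not come from GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed trip count, so that complete unrolling is possible. */
enum { N = 4 };

/* Original loop. */
int sum_rolled(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Peeling: one copy of the body is placed before the loop. */
int sum_peeled(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    if (i < n) {            /* peeled first iteration */
        s += a[i];
        i++;
    }
    for (; i < n; i++)      /* remaining iterations */
        s += a[i];
    return s;
}

/* Unrolling by 2: two copies of the body inside the loop. The backedge
   is preserved and the leftover iteration is handled after the loop. */
int sum_unrolled(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s += a[i];
        s += a[i + 1];
    }
    if (i < n)              /* remainder when n is odd */
        s += a[i];
    return s;
}

/* Complete unrolling (equivalently, complete peeling) for the known trip
   count N: every iteration is copied out, so no loop or backedge remains. */
int sum_complete(const int *a) {
    int s = 0;
    s += a[0];
    s += a[1];
    s += a[2];
    s += a[3];
    return s;
}

/* Sanity check: every variant must agree with the original loop. */
void check_variants(void) {
    const int a[N] = {1, 2, 3, 4};
    assert(sum_peeled(a, N) == sum_rolled(a, N));
    assert(sum_unrolled(a, N) == sum_rolled(a, N));
    assert(sum_complete(a) == sum_rolled(a, N));
}
```

In sum_complete the code size grows with the trip count, which is
presumably the kind of growth the proposed flag_cunroll_grow_size is meant
to gate.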