https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95018
--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #20)
> (In reply to Jiu Fu Guo from comment #18)
> > Currently, I'm thinking to enhance GCC 'cunroll' as:
> > if the loop has multi-exits or upbound is not a fixed number, we may not do
> > 'complete unroll' for the loop, except -funroll-all-loops is specified.
>
> That doesn't make much sense (-funroll-all-loops is RTL unroller only).
>
> I think the growth limits are simply too large unless we compute a "win"
> which we in this case do not. So I'd say the growth limits should scale
> with win ^ (1/new param) thus if we estimate to eliminate 20% of the
> loop stmts due to unrolling then the limit to apply is
> limit * (0.2 ^ (1/X)) with X maybe defaulting to 2.
>
> I'd only apply this new limit for peeling (peeling is when the loop count
> is not constant and thus we keep the exit tests).
>
> Of course people want more peeling (hello POWER people!)
Btw, the issue with the rs6000 code at present is that it uses
unroll_only_small_loops but that only affects the RTL unroller
while the enablement of -funroll-loops at -O2 affects GIMPLE
as well but unconstrained (with -O3 params). For the main
unroll pass (not cunrolli) this triggers code size growth:
  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
                                                   || flag_peel_loops
                                                   || optimize >= 3, true);
The "original" patch also adjusted parameters. If the intent is to only
affect the RTL unroller then we need a separate flag controlling it
(yeah, using the same flags as the heuristic trigger was probably bad).