> > The patch requires bit of testsuite changes
> >  - I disabled ch in loop-unswitch-17.c since it tests unswitching of
> >    loop invariant conditional.
> >  - pr103079.c needs ch disabled to trigger vrp situation it tests for
> >    (otherwise we optimize stuff earlier and better)
> >  - copy-headers-7.c now gets only 2 basic blocks duplicated since
> >    last conditional does not seem to benefit from duplicating,
> >    so I reordered them.
> > copy-headers-9 tests the new logic.
> >
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> OK.  In case the size heuristics are a bit too optimistic we could avoid the
> peeling in the -Os case?  Did you do any stats on TUs to see whether code
> actually increases in the end?

Thanks!

I only did stats on tramp3d and some GCC source files with -O2, where
the new heuristics actually tend to duplicate fewer BBs overall because
the logic stops the duplication chain after the last winning header,
while the previous implementation keeps rolling the loop further.  The
difference is small (sub 1%) since most loops are very simple and have
only one header BB to duplicate.  However, we handle more loops overall
and produce more do-whiles.

I think there is some potential in making the heuristics more
speculative now and allowing more partial peeling, but the code right
now is still on the safe side.

For -Os we set the code growth limit to 0, so we only duplicate if we
know that one of the two copies will be optimized out.  This is stricter
than what we did previously and I need to get more stats on this - we
may want to bump up the limit or at least increase it to account for the
extra jump saved by the while -> do-while conversion.
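
For reference, a minimal source-level sketch of the while -> do-while
conversion discussed above.  The function names are hypothetical and not
taken from the patch or the testsuite; the pass itself works on GIMPLE
basic blocks, so this only illustrates the resulting loop shape.

```c
#include <stddef.h>

/* Original shape: the exit test sits in the loop header, so every
   iteration ends with a jump back up to the test.  */
long sum_while(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    while (i < n) {
        s += a[i];
        i++;
    }
    return s;
}

/* Shape after header copying (hand-written illustration): one copy of
   the test guards loop entry, the other becomes the test at the bottom
   of the loop, giving a do-while.  */
long sum_do_while(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    if (i < n) {
        do {
            s += a[i];
            i++;
        } while (i < n);
    }
    return s;
}
```

The header test is duplicated here, which is the code growth in
question.  If one of the two copies can be proven constant (for example,
the entry guard when the loop is known to iterate at least once), that
copy folds away, which is the only case the -Os limit of 0 allows.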

Honza
