> > The patch requires bit of testsuite changes
> >  - I disabled ch in loop-unswitch-17.c since it tests unswitching of
> >    loop invariant conditional.
> >  - pr103079.c needs ch disabled to trigger vrp situation it tests for
> >    (otherwise we optimize stuff earlier and better)
> >  - copy-headers-7.c now gets only 2 basic blocks duplicated since
> >    last conditional does not seem to benefit from duplicating,
> >    so I reordered them.
> > copy-headers-9 tests the new logic.
> >
> > Bootstrapped/regtested x86_64-linux, OK?
>
> OK.

Thanks!

> In case the size heuristics are a bit too optimistic we could avoid the
> peeling in the -Os case?  Did you do any stats on TUs to see whether code
> actually increases in the end?
I only did stats on tramp3d and some GCC source files with -O2, where the
new heuristics actually tend to duplicate fewer BBs overall. This is because
the logic stops the duplication chain after the last winning header, while
the previous implementation keeps rolling the loop further. The difference is
small (under 1%), since most loops are very simple and have only one header
BB to duplicate. We do, however, handle more loops overall and produce more
do-while loops.

I think there is some potential in making the heuristics more speculative
now and allowing more partial peeling, but the code right now is still on
the safe side.

For -Os we set the code growth limit to 0, so we only duplicate if we know
that one of the two copies will be optimized out. This is stricter than what
we did previously and I need to get more stats on this. We may want to bump
up the limit, or at least increase it to account for the extra jump saved by
the while -> do-while conversion.

Honza
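
For readers less familiar with loop header copying, the while -> do-while conversion discussed above can be sketched at the source level roughly as follows. This is a hand-written illustration, not output from GCC, and the function names (sum_while, sum_do_while, sum_first_four) are hypothetical. The third function shows the kind of case where I would expect one of the two copies of the test to be foldable, because the trip count is known to be nonzero.

```c
#include <stddef.h>

/* Original form: the header test `i < n` runs before every iteration,
   and each iteration ends with a jump back to that header.  */
static long sum_while(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    while (i < n) {
        s += a[i];
        i++;
    }
    return s;
}

/* Header copied by hand: one copy of the test guards loop entry,
   the other becomes the exit test at the bottom of a do-while loop.
   The price is one duplicated conditional.  */
static long sum_do_while(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    if (i < n) {                /* duplicated header: entry guard */
        do {
            s += a[i];
            i++;
        } while (i < n);        /* remaining copy: loop exit test */
    }
    return s;
}

/* With a constant nonzero bound the entry guard `0 < 4` is trivially
   true, so only one copy of the test has to survive.  This is the sort
   of situation where duplication should not grow the code.  */
static long sum_first_four(const int *a)
{
    long s = 0;
    size_t i = 0;
    do {
        s += a[i];
        i++;
    } while (i < 4);
    return s;
}
```

All three functions compute the same sum for the same input. The do-while forms differ only in where the test sits and in whether the entry guard is needed at all.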