> On Wed, Nov 7, 2012 at 10:40 AM, Jan Hubicka wrote:
> > Hi,
> > with inliner predicates, the inliner heuristic now is able to prove that
> > some of the inlined function body will be optimized out after inlining.
> > This makes it possible to estimate the speedup that is now used to drive
> > the badness metric, but it is ignored in actual decision whether function
> > is inline candidate.
>
> Is it really still the time for this kind of changes? Development
> stage3 means "regression fixes only" and this isn't a regression...

I discussed with Jakub/Richi that I would like to do inliner heuristic
retuning in early stage 3. This is part of it. I am hoping to be done
soon. While the changes were made a while ago, I am pushing them out
slowly so they can be benchmarked independently. I am not able to do too
many SPEC2k6 runs in a week.

I had a bit of a hard time getting the inliner to the level of 4.7 on
Mozilla LTO and tramp3d, which are both hard to analyze. This turned out
to be mostly the addr_expr issues. We no longer forward propagate as much
as we used to, in order to keep information for the objsize pass, and
this made a lot of C++ abstraction no longer zero cost. There was also
the stupid overflow in the time metric that made some inlining decisions
completely random.

The inliner seems to be in relatively good shape performance-wise,
getting quite consistent improvements in C++. (tramp3d is 50% smaller and
faster than before, wave and DLV also improved in both code size and
speed, Mozilla is faster & smaller and we now get smaller code from -Os
than -O2 on the C++ stuff, LTO SPEC builds got smaller with the same
speed, http://gcc.opensuse.org/c++bench-frescobaldi/).

Overall, I plan to add one extra inliner hint for array indexes to help
Fortran array descriptors, and to enable use of gcov's histograms, which
Google apparently forgot to do (that is FDO only). So if I wait today to
see the effect of the ipa-cp change that is probably going in, I should
be done by Saturday (at a rate of one patch a day).

Next week I plan to run some benchmarks to see if the inlining limits can
be pushed down a bit, but it does not seem to be critical. Pushing
overall growth to 15% or less would work wonders for Firefox with LTO
(that probably won't matter much in practice since we are impractically
slow and memory hungry at WPA). Reducing inline-insns-auto/single may
work given that we can now bypass it in the cases that matter. Neither
one is too critical, however.

I also still need to analyze the botan regression on x86, which is the
only one left on the table (not necessarily inliner related), and see if
the IA-64 regressions are inliner related or something else. There is
also an EON regression at -O2 that seems to be related to an unrolling
heuristic decision. It seems to reproduce on AMD hardware only, so it may
be a simple code layout problem.

I also plan to look into the compile-time regression with a large number
of callees in a single function (one of Lucier's old PRs). This can be
fixed by incrementally updating the call statement costs as edges are
added/removed instead of recomputing them from scratch.

Honza
>
> Ciao!
>
> Steven
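
To illustrate the incremental update of call statement costs mentioned above, here is a minimal sketch in C. It is not GCC code: `call_edge`, `caller_summary` and all the function names are hypothetical stand-ins, and it assumes the per-caller quantity is a plain sum of per-edge costs. The idea is only that keeping a cached total and adjusting it on each edge insertion or removal avoids walking all callees again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT GCC's structures. */
struct call_edge {
    int call_stmt_cost;             /* estimated cost of this call statement */
    struct call_edge *prev, *next;  /* links among edges of the same caller */
};

struct caller_summary {
    struct call_edge *callees;      /* head of the list of outgoing edges */
    long total_call_cost;           /* cached sum of call_stmt_cost */
};

/* Baseline: recompute from scratch, linear in the number of callees. */
static long recompute_call_cost(const struct caller_summary *s)
{
    long sum = 0;
    for (const struct call_edge *e = s->callees; e != NULL; e = e->next)
        sum += e->call_stmt_cost;
    return sum;
}

/* Incremental: link the edge at the head and adjust the cached sum, O(1). */
static void add_edge(struct caller_summary *s, struct call_edge *e)
{
    e->prev = NULL;
    e->next = s->callees;
    if (s->callees != NULL)
        s->callees->prev = e;
    s->callees = e;
    s->total_call_cost += e->call_stmt_cost;
}

/* Incremental: unlink an edge that is currently in s and adjust the sum, O(1). */
static void remove_edge(struct caller_summary *s, struct call_edge *e)
{
    if (e->prev != NULL)
        e->prev->next = e->next;
    else
        s->callees = e->next;
    if (e->next != NULL)
        e->next->prev = e->prev;
    e->prev = e->next = NULL;
    s->total_call_cost -= e->call_stmt_cost;
}

/* Debug check: the cached total must match a full recomputation. */
static void check_summary(const struct caller_summary *s)
{
    assert(s->total_call_cost == recompute_call_cost(s));
}
```

With N callees added one at a time, recomputing after every change costs O(N^2) in total, while the cached sum costs O(N). The full recomputation is kept only as a consistency check.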