Re: Enable inliner to bypass inline-insns-single/auto when it knows the performance will improve

Jan Hubicka Wed, 07 Nov 2012 02:39:19 -0800

> On Wed, Nov 7, 2012 at 10:40 AM, Jan Hubicka wrote:
> > Hi,
> > with inliner predicates, the inliner heuristic now is able to prove that
> > some of the inlined function body will be optimized out after inlining.
> > This makes it possible to estimate the speedup that is now used to drive
> > the badness metric, but it is ignored in actual decision whether function
> > is inline candidate.
> 
> Is it really still the time for this kind of changes? Development
> stage3 means "regression fixes only" and this isn't a regression...


I discussed with Jakub/Richi that I would like to do the inliner heuristic
re-tuning at early stage 3.  This is part of it.  I am hoping to be done soon.
While the changes were made a while ago, I am pushing them out slowly so they
can be independently benchmarked.  I am not able to do too many SPEC2k6 runs
in a week.

I had a bit of a hard time getting the inliner to the level of 4.7 on Mozilla
LTO and tramp3d, which are both hard to analyze.  This turned out to be mostly
the addr_expr issues: we no longer forward propagate as much as we did, in
order to keep info for the objsize pass, and this made a lot of C++ abstraction
no longer zero cost.  Also there was the stupid overflow in the time metric
making some inlining completely random.  The inliner seems to be in relatively
good shape performance-wise, getting quite consistent improvements in C++
(tramp3d is 50% smaller and faster than before, wave and DLV also improved in
both code size and speed, Mozilla is faster and smaller, we now get smaller
code from -Os than -O2 on the C++ stuff, and LTO SPEC builds got smaller with
the same speed, http://gcc.opensuse.org/c++bench-frescobaldi/).

Overall, I plan to add one extra inliner hint for array indexes to help
Fortran array descriptors, and to enable use of gcov's histograms, which Google
apparently forgot to do (that is FDO only).  So if I wait today to see the
effect of the ipa-cp change that is probably going in, I should be done by
Saturday (at the speed of a patch a day).

Next week I plan to run some benchmarks to see if the inlining limits can be
pushed down a bit, but it does not seem to be critical.  Pushing overall growth
to 15% or less would do wonders for Firefox with LTO (that probably won't
matter much in practice since we are impractically slow and memory hungry at
WPA); reducing inline-insns-auto/single may work given that we can now bypass
it in cases that matter.  Neither one is too critical, however.
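
As a rough illustration of the bypass idea (not the actual patch), the
candidate check can be sketched as below.  All type, field and function names
are hypothetical, and the real heuristic works on the inliner's own size and
time estimates rather than on a single boolean.

```cpp
#include <cassert>

// Hypothetical description of a call's callee; not GCC's real summary type.
struct Candidate {
  int estimated_size;    // estimated callee body size, in "insns"
  bool declared_inline;  // selects which of the two limits applies
  bool speedup_known;    // analysis proved part of the body is optimized out
};

// Hypothetical holder for the two size limits named in the subject line.
struct Limits {
  int insns_single;  // limit for functions declared inline
  int insns_auto;    // limit for functions considered automatically
};

// A callee within its size limit is always a candidate.  A callee over the
// limit is still a candidate when a speedup is known, which is the bypass.
bool is_inline_candidate(const Candidate &c, const Limits &lim) {
  const int limit = c.declared_inline ? lim.insns_single : lim.insns_auto;
  if (c.estimated_size <= limit)
    return true;
  return c.speedup_known;
}
```

With a check of this shape, lowering the limits only affects callees for which
no speedup is known.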

I also still need to analyze the botan regression, which is the only one left
on the table for x86 (not necessarily inliner related), and see if the IA-64
regressions are inliner related or something else.  There is also the EON
regression at -O2 that seems to be related to an unrolling heuristic decision.
It seems to reproduce on AMD hardware only, so it may be a simple code layout
problem.

I also plan to look into the compile time regression with a large number of
callees in a single function (one of Lucier's old PRs).  This can be fixed by
incrementally updating the call statement costs as edges are added/removed
instead of recomputing them from scratch.
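
A minimal sketch of the incremental update described above, assuming each call
edge carries a single integer cost.  The class and its members are hypothetical
and are not GCC's real data structures; the point is only that adding or
removing an edge adjusts a running total in constant time, while the
from-scratch walk is linear in the number of callees.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-caller bookkeeping of call statement costs.
class CallerCosts {
public:
  // Adding an edge records its cost and adjusts the running total in O(1).
  void add_edge(int edge_id, std::int64_t cost) {
    const bool fresh = costs_.emplace(edge_id, cost).second;
    assert(fresh && "edge already present");
    (void)fresh;  // silence unused-variable warnings when NDEBUG is defined
    total_ += cost;
  }

  // Removing an edge subtracts exactly the cost recorded for it.
  void remove_edge(int edge_id) {
    auto it = costs_.find(edge_id);
    assert(it != costs_.end() && "unknown edge");
    total_ -= it->second;
    costs_.erase(it);
  }

  // Running total, available without walking the edges.
  std::int64_t total() const { return total_; }

  // From-scratch walk over all edges; kept here only as a cross-check.
  std::int64_t recompute() const {
    std::int64_t sum = 0;
    for (const auto &entry : costs_)
      sum += entry.second;
    return sum;
  }

private:
  std::unordered_map<int, std::int64_t> costs_;
  std::int64_t total_ = 0;
};
```

Calling recompute() after every edge change makes the work quadratic in the
number of callees of one function, whereas total() stays constant time per
change.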

Honza
> 
> Ciao!
> Steven
