On Sat, Sep 10, 2016 at 8:04 AM, Yuan, Pengfei <y...@pku.edu.cn> wrote:
> Hi,
>
> Previously I have sent a patch on profile based option tuning:
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01377.html
>
> According to Richard Biener's advice, I try investigating where the code size
> reduction comes from. After analyzing the dumped IL, I figure out that it is
> related to function inlining. Some cold functions are inlined regardless of
> profile feedback, which increases code size.
>
> The problem is with the early inliner. In want_early_inline_function_p, if the
> estimated edge growth > 0, want_inline depends on maybe_hot_p, which usually
> returns true unless optimize_size, since profile feedback is not available at
> this point. Some functions which may be cold according to profile feedback are
> inlined regardlessly, resulting in code size increase.
>
> At first, I come up with a solution that preloads some profile info before
> pass_early_inline. But it fails with numerous coverage-mismatch errors in
> pass_ipa_tree_profile. Therefore, the proposed patch prevents early inlining
> with positive code size growth if FDO is enabled.
>
> Experiment results are as follows:
>
> Setup
>   Hardware                          Core i7-4770, 32GB RAM
>   OS                                Debian sid amd64
>   Compiler                          GCC 5.4.1 20160907
>   Firefox source                    mozilla-central, cset 91c2b9d5c135
>   Training workload                 css3test.com, html5test.com, Octane benchmark
>
> Vanilla GCC
>   Code size (.text of libxul.so)    48708873
>   Octane benchmark (score)          35828 36618 35847
>   Kraken benchmark (time)           939.4ms 964.0ms 951.8ms
>
> Patched GCC
>   Code size (.text of libxul.so)    44686265
>   Octane benchmark (score)          36103 35740 35611
>   Kraken benchmark (time)           928.9ms 949.1ms 938.7ms
>
> There is over 8% reduction in code size, while no obvious difference in
> performance. The experiment is conducted with GCC 5. There is segmentation
> fault when starting Firefox instrumented by GCC 6. GCC 7 encounters ICE when
> building Firefox.

I think the approach is reasonable though it might still have
interesting effects on optimization for very small growth. So for
further experimenting it would be nice to have a separate
PARAM_EARLY_FDO_INLINING_INSNS or maybe simply adjust the
PARAM_EARLY_INLINING_INSNS default accordingly when FDO is
enabled?

I'll let Honza also double-check the condition detecting FDO (it looks like
we should have some abstraction for that).

Thanks,
Richard.

> Regards,
>
> Yuan, Pengfei
>
>
> gcc/ChangeLog:
>         * ipa-inline.c (want_early_inline_function_p): Be more conservative
>         if FDO is enabled.
>
>
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 7097cf3..8266f97 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -628,6 +628,20 @@ want_early_inline_function_p (struct cgraph_edge *e)
>
>    if (growth <= 0)
>      ;
> +  /* Profile feedback is not available at this point.
> +     Be more conservative if FDO is enabled.  */
> +  else if ((profile_arc_flag && !flag_test_coverage)
> +          || (flag_branch_probabilities && !flag_auto_profile))
> +    {
> +      if (dump_file)
> +       fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
> +                "FDO is enabled and code would grow by %i\n",
> +                xstrdup_for_dump (e->caller->name ()),
> +                e->caller->order,
> +                xstrdup_for_dump (callee->name ()), callee->order,
> +                growth);
> +      want_inline = false;
> +    }
>    else if (!e->maybe_hot_p ()
>            && growth > 0)
>      {
>
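
To make the suggestion above concrete, here is a rough standalone model of how a separate growth limit for FDO builds might slot into the early-inline decision. It is only a sketch and not GCC code: struct inline_ctx, want_early_inline_sketch and all field names are hypothetical, the FDO check is reduced to a single boolean (the real condition is the one Honza is asked to review), and the limits stand in for PARAM_EARLY_INLINING_INSNS and the proposed PARAM_EARLY_FDO_INLINING_INSNS without claiming any real default values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the early inliner consults.
   None of these names exist in GCC.  */
struct inline_ctx
{
  bool fdo_enabled;             /* outcome of the FDO-detection condition */
  bool edge_maybe_hot;          /* models what e->maybe_hot_p () returns */
  int early_inlining_insns;     /* models PARAM_EARLY_INLINING_INSNS */
  int early_fdo_inlining_insns; /* models the proposed separate parameter */
};

/* Simplified model of the decision: inline when code does not grow,
   refuse cold edges that grow, otherwise compare the growth against a
   limit that depends on whether FDO is enabled.  */
static bool
want_early_inline_sketch (const struct inline_ctx *ctx, int growth)
{
  assert (ctx != NULL);

  if (growth <= 0)
    return true;

  if (!ctx->edge_maybe_hot)
    return false;

  int limit = ctx->fdo_enabled ? ctx->early_fdo_inlining_insns
                               : ctx->early_inlining_insns;
  return growth <= limit;
}
```

In this model, setting early_fdo_inlining_insns to 0 reproduces the behaviour of the posted patch (no early inlining with positive growth under FDO), while a small positive value would allow experimenting with the very small growth cases mentioned above.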