On Sat, Sep 10, 2016 at 8:04 AM, Yuan, Pengfei <[email protected]> wrote:
> Hi,
>
> I previously sent a patch on profile-based option tuning:
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01377.html
>
> Following Richard Biener's advice, I investigated where the code size
> reduction comes from. After analyzing the dumped IL, I found that it is
> related to function inlining. Some cold functions are inlined regardless of
> profile feedback, which increases code size.
>
> The problem is with the early inliner. In want_early_inline_function_p, if the
> estimated edge growth > 0, want_inline depends on maybe_hot_p, which usually
> returns true unless optimize_size, since profile feedback is not available at
> this point. Some functions which may be cold according to profile feedback are
> inlined regardless, resulting in a code size increase.
>
> At first, I came up with a solution that preloads some profile info before
> pass_early_inline, but it fails with numerous coverage-mismatch errors in
> pass_ipa_tree_profile. Therefore, the proposed patch prevents early inlining
> with positive code size growth if FDO is enabled.
>
> Experiment results are as follows:
>
> Setup
>   Hardware                         Core i7-4770, 32GB RAM
>   OS                               Debian sid amd64
>   Compiler                         GCC 5.4.1 20160907
>   Firefox source                   mozilla-central, cset 91c2b9d5c135
>   Training workload                css3test.com, html5test.com, Octane benchmark
>
> Vanilla GCC
>   Code size (.text of libxul.so)   48708873
>   Octane benchmark (score)         35828    36618    35847
>   Kraken benchmark (time)          939.4ms  964.0ms  951.8ms
>
> Patched GCC
>   Code size (.text of libxul.so)   44686265
>   Octane benchmark (score)         36103    35740    35611
>   Kraken benchmark (time)          928.9ms  949.1ms  938.7ms
>
> There is a reduction of over 8% in code size, with no obvious difference in
> performance. The experiment was conducted with GCC 5. Firefox instrumented by
> GCC 6 hits a segmentation fault on startup, and GCC 7 encounters an ICE when
> building Firefox.
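
For reference, the quoted figures can be checked with a few lines of C. This is only a sketch of the arithmetic; the helper names size_reduction_percent and mean3 are made up for illustration and come from neither the patch nor GCC. With the numbers above, the .text reduction works out to about 8.26%, the mean Kraken time drops from about 951.7ms to 938.9ms, and the mean Octane score goes from about 36098 to 35818.

```c
/* Hypothetical helpers for checking the numbers quoted in the mail.
   Neither function exists in GCC or in the patch.  */

/* Relative size reduction in percent: 100 * (before - after) / before.  */
double
size_reduction_percent (long before, long after)
{
  return 100.0 * (double) (before - after) / (double) before;
}

/* Arithmetic mean of three benchmark runs.  */
double
mean3 (double a, double b, double c)
{
  return (a + b + c) / 3.0;
}
```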

I think the approach is reasonable though it might still have interesting
effects on optimization for very small growth. So for further experimenting
it would be nice to have a separate PARAM_EARLY_FDO_INLINING_INSNS or maybe
simply adjust the PARAM_EARLY_INLINING_INSNS default accordingly when FDO is
enabled?

I'll let Honza also double-check the condition detecting FDO (it looks like
we should have some abstraction for that).

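
To make the parameter idea concrete, here is a minimal standalone sketch, not GCC code. The struct inline_config and both functions are hypothetical stand-ins. The field early_fdo_inlining_insns models the proposed PARAM_EARLY_FDO_INLINING_INSNS, which does not exist yet, and the limit values are left to the caller rather than taken from GCC's defaults.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant compiler state; these are not
   the real GCC declarations.  */
struct inline_config
{
  bool fdo_enabled;             /* Result of whatever FDO check is used.  */
  int early_inlining_insns;     /* Models PARAM_EARLY_INLINING_INSNS.  */
  int early_fdo_inlining_insns; /* Models the proposed FDO-specific param.  */
};

/* Pick the growth limit that applies to early inlining.  */
int
early_inline_growth_limit (const struct inline_config *cfg)
{
  return cfg->fdo_enabled ? cfg->early_fdo_inlining_insns
                          : cfg->early_inlining_insns;
}

/* True if the estimated growth is acceptable under the selected limit.
   Non-positive growth is always acceptable.  */
bool
growth_within_limit (const struct inline_config *cfg, int growth)
{
  return growth <= 0 || growth <= early_inline_growth_limit (cfg);
}
```

Setting the FDO limit to 0 in this model gives the same behaviour as the patch (no early inlining with positive growth under FDO), while a small positive value would still allow very small growth.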
Thanks,
Richard.
> Regards,
>
> Yuan, Pengfei
>
>
> gcc/ChangeLog:
> * ipa-inline.c (want_early_inline_function_p): Be more conservative
> if FDO is enabled.
>
>
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 7097cf3..8266f97 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -628,6 +628,20 @@ want_early_inline_function_p (struct cgraph_edge *e)
>
>        if (growth <= 0)
>          ;
> +      /* Profile feedback is not available at this point.
> +         Be more conservative if FDO is enabled. */
> +      else if ((profile_arc_flag && !flag_test_coverage)
> +               || (flag_branch_probabilities && !flag_auto_profile))
> +        {
> +          if (dump_file)
> +            fprintf (dump_file, " will not early inline: %s/%i->%s/%i, "
> +                     "FDO is enabled and code would grow by %i\n",
> +                     xstrdup_for_dump (e->caller->name ()),
> +                     e->caller->order,
> +                     xstrdup_for_dump (callee->name ()), callee->order,
> +                     growth);
> +          want_inline = false;
> +        }
>        else if (!e->maybe_hot_p ()
>                 && growth > 0)
>          {
>
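
The decision order in the hunk above can be modelled outside of GCC as follows. This is a simplified sketch: struct fdo_flags, fdo_enabled_p and want_early_inline_sketch are hypothetical names, the four booleans merely mirror the flags tested by the patch, and the remaining checks in want_early_inline_function_p are omitted. Factoring the flag test into one predicate is one possible shape for the abstraction mentioned in the review.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the four flags tested by the patch.  */
struct fdo_flags
{
  bool profile_arc;          /* Models profile_arc_flag.  */
  bool test_coverage;        /* Models flag_test_coverage.  */
  bool branch_probabilities; /* Models flag_branch_probabilities.  */
  bool auto_profile;         /* Models flag_auto_profile.  */
};

/* Same condition as in the patch: instrumentation for profiling (but not
   plain coverage testing), or use of a profile that is not an auto
   profile.  */
bool
fdo_enabled_p (const struct fdo_flags *f)
{
  return (f->profile_arc && !f->test_coverage)
         || (f->branch_probabilities && !f->auto_profile);
}

/* Simplified model of the patched decision; later checks are left out.  */
bool
want_early_inline_sketch (const struct fdo_flags *f, int growth,
                          bool maybe_hot)
{
  if (growth <= 0)
    return true;    /* No code growth: inlining is always fine.  */
  if (fdo_enabled_p (f))
    return false;   /* Patch: no growth allowed before profile is read.  */
  if (!maybe_hot)
    return false;   /* Existing check: the call is cold.  */
  return true;
}
```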