Hi Honza,

On Sat, 12 Jan 2019 at 19:32, Jan Hubicka <hubi...@ucw.cz> wrote:
>
> Hello,
> this patch sets inline-unit-growth to 40.  The performance changes are
> - Firefox, LTO
>   
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1
>   After fixes to inlining priorities this makes difference without
>   profile feedback only.
>
>   Code size growth is about 9.15% with LTO and 3.95 with LTO and profile
>   feedback.
> - Firefox noLTO
>   
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1
>   With about 7% code size growth
> - SPEC
>   
> https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f
> - C++ benchmarks
>   
> https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f
>
> I am not entirely happy about the code-size/performance tradeoffs but it
> is concerned only for programs built with -O3 or having too many inline
> keywords.  I have looked into inlining decisions for Firefox, HHVM and
> Clang and inliner gets out of growt bounds way too early and some of
> more performance aware projects already sets the limit up.
>
> I will tune other metrics down to handle some of the code size problems.
>
> Honza
>
> Index: ChangeLog
> ===================================================================
> --- ChangeLog   (revision 267882)
> +++ ChangeLog   (working copy)
> @@ -1,3 +1,7 @@
> +2019-01-05  Jan Hubicka  <hubi...@ucw.cz>
> +
> +       * params.def (inline-unit-growth): Set to 40.
> +
>  2019-01-12  Jakub Jelinek  <ja...@redhat.com>
>
>         * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo.
> Index: params.def
> ===================================================================
> --- params.def  (revision 267882)
> +++ params.def  (working copy)
> @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS,
>  DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>          "inline-unit-growth",
>          "How much can given compilation unit grow because of the inlining 
> (in percent).",
> -        20, 0, 0)
> +        40, 0, 0)
>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>          "ipcp-unit-growth",
>          "How much can given compilation unit grow because of the 
> interprocedural constant propagation (in percent).",

This patch introduces a regression in libstdc++:
FAIL: ext/pb_ds/regression/list_update_map_rand.cc execution test
on a few arm targets.

For instance:
arm-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a5
--with-fpu vfpv3-d16-fp16

Using --with-mode thumb and the same other configure options makes the
test pass.
I'm seeing this with other configurations --with-mode arm and
--with-fpu vfp* (as opposed to neon*)

The .log file has:
<?xml version = "1.0"?>
<test>
<sd value = "1547347280">
</sd>
<n value = "50">
</n>
<m value = "10">
</m>
<tp value = "0.2">
</tp>
<ip value = "0.6">
</ip>
<ep value = "0.2">
</ep>
<cp value = "0.001">
</cp>
<mp value = "0.25">
</mp>
</test>
<cntnr name = "lu_mtf_map">
<desc>
<type value = "list_update">
<Update_Policy value = "lu_move_to_front_policy">
</Update_Policy>
</type>
</desc>
<progress>
----------------------------------------
************qemu: uncaught target signal 11 (Segmentation fault) - core dumped

The (incomplete?) qemu execution trace ends with:
----------------
IN:
0x40ada6b8:  e5910008  ldr      r0, [r1, #8]
0x40ada6bc:  e1560000  cmp      r6, r0
0x40ada6c0:  1a00004f  bne      #0x40ada804
----------------
IN:
0x40ada6c4:  e5960004  ldr      r0, [r6, #4]
0x40ada6c8:  e582100c  str      r1, [r2, #0xc]
0x40ada6cc:  e3500c02  cmp      r0, #0x200
0x40ada6d0:  e5812008  str      r2, [r1, #8]
0x40ada6d4:  3a000002  blo      #0x40ada6e4
----------------
IN:
0x40adb880:  eaffff3e  b        #0x40adb580
----------------
IN: 
_ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE
0x0001ffc4:  e3a06000  mov      r6, #0
0x0001ffc8:  eaffff88  b        #0x1fdf0
----------------
IN: 
_ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE
0x0001fdf0:  ee180a10  vmov     r0, s16
0x0001fdf4:  ebffcd12  bl       #0x13244

Christophe

Reply via email to