Hi Honza,

On Sat, 12 Jan 2019 at 19:32, Jan Hubicka <hubi...@ucw.cz> wrote:
>
> Hello,
> this patch sets inline-unit-growth to 40. The performance changes are
> - Firefox, LTO
>
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=f7bd026e1a931b9a284d1c85c2577a72dd592820&newProject=try&newRevision=74889968abcc688b8d161863566ed273c0401ee4&framework=1&filter=opt&showOnlyComparable=1&showOnlyImportant=1
> After fixes to inlining priorities this makes a difference without
> profile feedback only.
>
> Code size growth is about 9.15% with LTO and 3.95% with LTO and profile
> feedback.
> - Firefox noLTO
>
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c902b72340a3dca3114f58578c1c8f3e6a1cd89c&newProject=try&newRevision=4974da6f92c144a9c09765b56a564a640069ddb9&framework=1&showOnlyComparable=1&showOnlyImportant=1
> With about 7% code size growth
> - SPEC
>
> https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f
> - C++ benchmarks
>
> https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=10&all_changes=on&min_percentage_change=0.02&revisions=46e2bd1143b5c60af814916d7673879b34ceb3f6%2Cc0d79cfe9c4ec30823480f2f9b256600e8e3899f
>
> I am not entirely happy about the code-size/performance tradeoffs but it
> is a concern only for programs built with -O3 or having too many inline
> keywords. I have looked into inlining decisions for Firefox, HHVM and
> Clang and the inliner gets out of growth bounds way too early and some of
> the more performance aware projects already set the limit up.
>
> I will tune other metrics down to handle some of the code size problems.
>
> Honza
>
> Index: ChangeLog
> ===================================================================
> --- ChangeLog (revision 267882)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,7 @@
> +2019-01-05 Jan Hubicka <hubi...@ucw.cz>
> +
> + * params.def (inline-unit-growth): Set to 40.
> +
>  2019-01-12 Jakub Jelinek <ja...@redhat.com>
>
>  * tree-ssa-loop-ivopts.c (find_inv_vars): Fix a comment typo.
> Index: params.def
> ===================================================================
> --- params.def (revision 267882)
> +++ params.def (working copy)
> @@ -227,7 +227,7 @@ DEFPARAM(PARAM_LARGE_UNIT_INSNS,
>  DEFPARAM(PARAM_INLINE_UNIT_GROWTH,
>  "inline-unit-growth",
>  "How much can given compilation unit grow because of the inlining (in percent).",
> - 20, 0, 0)
> + 40, 0, 0)
>  DEFPARAM(PARAM_IPCP_UNIT_GROWTH,
>  "ipcp-unit-growth",
>  "How much can given compilation unit grow because of the interprocedural constant propagation (in percent).",
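
For context, here is a rough sketch of what the parameter change means for the inliner's size budget. It assumes the budget is the unit's initial size (raised to large-unit-insns for small units) scaled by (100 + inline-unit-growth) percent. The function and argument names are hypothetical and this is not copied from the GCC sources.

```c
#include <stdint.h>

/* Hypothetical helper: estimate the instruction budget for a unit.
   initial_insns    - estimated size of the unit before inlining
   large_unit_insns - floor below which a unit is treated as this size
                      (assumed to mirror the large-unit-insns param)
   growth_percent   - allowed growth in percent (inline-unit-growth)  */
int64_t max_unit_insns(int64_t initial_insns, int64_t large_unit_insns,
                       int growth_percent)
{
    /* Small units are budgeted as if they were "large" units. */
    int64_t base = initial_insns < large_unit_insns ? large_unit_insns
                                                    : initial_insns;

    /* Scale by (100 + growth)%, using integer arithmetic. */
    return base * (100 + growth_percent) / 100;
}
```

Under these assumptions, a unit of 100000 insns may grow to 120000 with the old value of 20 and to 140000 with the new value of 40.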

This patch introduces a regression in libstdc++:
FAIL: ext/pb_ds/regression/list_update_map_rand.cc execution test
on a few arm targets. For instance:
arm-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a5
--with-fpu vfpv3-d16-fp16

Using --with-mode thumb and the same other configure options makes the
test pass.

I'm seeing this with other configurations --with-mode arm and
--with-fpu vfp* (as opposed to neon*)

The .log file has:
<?xml version = "1.0"?>
<test>
<sd value = "1547347280">
</sd>
<n value = "50">
</n>
<m value = "10">
</m>
<tp value = "0.2">
</tp>
<ip value = "0.6">
</ip>
<ep value = "0.2">
</ep>
<cp value = "0.001">
</cp>
<mp value = "0.25">
</mp>
</test>
<cntnr name = "lu_mtf_map">
<desc>
<type value = "list_update">
<Update_Policy value = "lu_move_to_front_policy">
</Update_Policy>
</type>
</desc>
<progress>
----------------------------------------
************qemu: uncaught target signal 11 (Segmentation fault) - core dumped

The (incomplete?) qemu execution trace ends with:
----------------
IN:
0x40ada6b8:  e5910008  ldr r0, [r1, #8]
0x40ada6bc:  e1560000  cmp r6, r0
0x40ada6c0:  1a00004f  bne #0x40ada804

----------------
IN:
0x40ada6c4:  e5960004  ldr r0, [r6, #4]
0x40ada6c8:  e582100c  str r1, [r2, #0xc]
0x40ada6cc:  e3500c02  cmp r0, #0x200
0x40ada6d0:  e5812008  str r2, [r1, #8]
0x40ada6d4:  3a000002  blo #0x40ada6e4

----------------
IN:
0x40adb880:  eaffff3e  b #0x40adb580

----------------
IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE
0x0001ffc4:  e3a06000  mov r6, #0
0x0001ffc8:  eaffff88  b #0x1fdf0

----------------
IN: _ZN10__gnu_pbds4test6detail30container_rand_regression_testINS_11list_updateINS0_10basic_typeES4_St8equal_toIS4_ENS0_26lu_move_to_front_policy_t_EN9__gnu_cxx22throw_allocator_randomIS4_EEEEE13subscript_impENSt3tr117integral_constantIiLi0EEE
0x0001fdf0:  ee180a10  vmov r0, s16
0x0001fdf4:  ebffcd12  bl #0x13244

Christophe