On Sun, Oct 19, 2014 at 1:19 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> >
>> > I am surprised you hit the size limits with 4.9 only - for quite some time
>> > we keep all virtual functions in callgraph until inlining. In fact 4.9 is
>> > the first that works harder to drop them early (because I hit the problem
>> > with LTO where they artificially bloat the size of LTO object files)
>>
>> We can dig it more to later understand why only 4.9 hits the problem.
>
> This would be very interesting, because 4.9 ought to be better here
> (removing more virtuals early) than previous compilers.
> There was a number of changes in 4.9 that may affect this - some fixes at
> C++ side giving middle end more inline candidates and also this change
> https://gcc.gnu.org/ml/gcc-patches/2013-01/msg00834.html
> So perhaps most of your virtual functions are !COMDAT&&!EXTERNAL, but that
> does not seem to make much sense to me either :(
>>
>> My size results with -fno-devirtualize-speculatively are out. It
>> shrinks size by 1.68% -- slightly more than -fdevirtualize can do in
>> O2 compile.
>
> Hmm, this is interesting, too. 1.68% is definitely a lot more than I would
> expect or have seen on other testcases. You can take a look at summaries in
> -fdump-ipa-devirt pass.
>
> In opts.c I have:
>     case OPT_fprofile_use:
>       if (!opts_set->x_flag_branch_probabilities)
>       ...
>       /* Indirect call profiling should do all useful transformations
>          speculative devirtualization does.  */
>       if (!opts_set->x_flag_devirtualize_speculatively
>           && opts->x_flag_value_profile_transformations)
>         opts->x_flag_devirtualize_speculatively = false;
>
> so perhaps this hunk is somehow skipped with LIPO?

The 1.68% size reduction I measured is with plain O2 compilation, not LIPO.

>
> Speculative devirtualization is somewhat less useful (may have more false
> positives) without LTO depending on how your headers are constructed.
> It would be interesting to see if it makes a lot of mistakes on your codebase.
> (this can be easily done by forcing it to run with profile feedback, too and
> it will tell you when its speculation differs from speculation already there).
>>
>> By the way, you mentioned 'hacking the
>> ipa.c:walk_polymorphic_call_targets to not make the possible targets
>> as reachable' -- is that something worth doing in trunk? With that, we
>> can probably just turn off speculative devirtualization.
>
> Well, the check is there to enable inlining. Disabling it for
> -fprofile-generate will result in lost profile samples for virtual functions.
> Disabling it by default will prevent inlining of devirtualized calls making
> devirtualization not really useful.
> Perhaps with LIPO the situation is a bit different because you bring in the
> other module just to inline the call as you describe.

What I meant is whether it is suitable for a plain build? I have not
looked at the details.

>
> One thing I can imagine doing is to make the inliner consider the reachable
> (in post-inlining sense, that is after removing extern inlines and virtual
> functions) calls with priority and account only those to unit growth model.
> This would make it more consistent over -fdevirtualize and more realistic
> about resulting code size.
>
> I sort of considered this option but did not have any good data suggesting
> I should implement it.
>
> In general it would be nice to understand this problem. Also I plan to do
> some retuning for 5.0 so it would be nice to know if you have other issues
> with 4.9? (I did not closely follow Google branch changes, so if you can
> point out those that are relevant for IPA tuning, I would be very interested
> to see what problems you hit).

Ok I will collect a list of inlining related changes and let you know later.

thanks,

David

>
> Honza
>>
>> David
>>
>> >
>> > Honza
>> >>
>> >> David
>> >>
>> >> >
>> >> > Honza
>> >> >>
>> >> >> David
>> >> >>
>> >> >> On Sat, Oct 18, 2014 at 10:10 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> >> >> >> Disabling devirtualization reduces code size, both for
>> >> >> >> instrumentation (because many more virtual functions are kept
>> >> >> >> longer and therefore instrumented) and for normal optimization.
>> >> >> >
>> >> >> > OK, with profile instrumentation (that you seem to try to minimize)
>> >> >> > I can see how you get noticeably more counters because virtual
>> >> >> > functions are kept longer.
>> >> >> > (note that 4.9 is a lot more aggressive on removing unreachable
>> >> >> > virtual functions than earlier compilers).
>> >> >> >
>> >> >> > Instead of disabling -fdevirtualize completely (that will get you
>> >> >> > more indirect calls and thus more topn profiling) you may consider
>> >> >> > just hacking ipa.c:walk_polymorphic_call_targets to not make the
>> >> >> > possible targets as reachable. (see the conditional on
>> >> >> > before_inlining_p).
>> >> >> >
>> >> >> > Of course this will get you less devirtualization (but with LTO the
>> >> >> > difference should not be big - perhaps I could make switch for that
>> >> >> > for mainline) and less accurate profiles when you get speculative
>> >> >> > devirtualization via topn.
>> >> >> >
>> >> >> > I would be very interested to see how much difference this makes.
>> >> >> >
>> >> >> > Honza
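
For readers following the size discussion, here is a rough source-level
sketch of what a speculatively devirtualized call site looks like. It is a
hand-written analogue, not GCC output: the types (Shape, Square, Rect) and
function names are hypothetical, and the guard uses typeid purely for
illustration, whereas the compiler works on its internal representation of
the call target. The point is only that the speculative form keeps the
indirect call as a fallback and adds a guarded direct call next to it, which
is one plausible reason code can grow, especially once the direct call is
inlined.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical class hierarchy used only for this illustration.
struct Shape {
  virtual ~Shape() = default;
  virtual int area() const = 0;
};

struct Square : Shape {
  explicit Square(int side) : side_(side) {}
  int area() const override { return side_ * side_; }
  int side_;
};

struct Rect : Shape {
  Rect(int width, int height) : width_(width), height_(height) {}
  int area() const override { return width_ * height_; }
  int width_, height_;
};

// Plain virtual dispatch: a single indirect call through the vtable.
int area_indirect(const Shape &s) { return s.area(); }

// Source-level analogue of a speculative call: guess the likely target,
// call it directly (so it becomes an inline candidate) when the guess
// holds, and keep the indirect call as the fallback path.
int area_speculative(const Shape &s) {
  if (typeid(s) == typeid(Square)) {
    // Qualified call: non-virtual, resolved at compile time.
    return static_cast<const Square &>(s).Square::area();
  }
  return s.area();  // guess was wrong: ordinary virtual dispatch
}

// Both forms must agree regardless of whether the guess is right.
void self_check() {
  Square sq(3);
  Rect r(2, 5);
  assert(area_speculative(sq) == area_indirect(sq));
  assert(area_speculative(r) == area_indirect(r));
}
```

Calling self_check() exercises both the path where the guess is correct
(Square) and the fallback path (Rect). A wrong guess costs only the extra
comparison at run time, but the extra branch and any inlined body are paid
for in code size either way.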