On Fri, Apr 18, 2014 at 10:26 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> Hello,
>> Honza,
>> Seeing your recent patches relating to inliner heuristics for LTO,
>> I thought I should mention some related work I'm doing.
>>
>> By way of introduction, I've recently joined the IBM LTC's PPC
>> Toolchain team, working on gcc performance.
>>
>> We have not generally seen good results using LTO on IBM power
>> processors and one of the problems seems to be excessive inlining
>> that results in the generation of excessive spill code. So, I have
>> set out to tackle this by doing some analysis at the time of the
>> inliner pass to compute something analogous to register pressure,
>> which is then used to shut down inlining of routines that have a lot
>> of pressure.
>
> This is interesting. I sort of planned to add register pressure logic
> but always thought it is somewhat hard to do at GIMPLE level in a way
> that would work for all CPUs.
>>
>> The analysis is basically a liveness analysis on the SSA names per
>> basic block and looking for the maximum number live in any block.
>> I've been using "liveness pressure" as a shorthand name for this.
>
> I believe this is usually called width
>>
>> This can then be used in two ways.
>> 1) want_inline_function_to_all_callers_p at present always says to
>> inline things that have only one call site without regard to size or
>> what this may do to the register allocator downstream. In
>> particular, BZ2_decompress in bzip2 gets inlined and this causes the
>> pressure reported downstream for the int register class to increase
>> 10x. Looking at some combination of pressure in caller/callee may
>> help avoid this kind of situation.
>> 2) I also want to experiment with adding the liveness pressure in
>> the callee into the badness calculation in edge_badness used by
>> inline_small_functions. The idea here is to try to inline functions
>> that are less likely to cause register allocator difficulty
>> downstream first.
>
> Sounds interesting. I am very curious if you can get consistent improvements
> with this. I only implemented logic for large stack frames, but in C++ code
> it seems often to do more harm than good.
>
> If you find examples of bad inlining, can you also file them in bugzilla?
> Perhaps the individual cases could be handled better by improving IRA.

yes -- I think this is the right time to do regardless.

David

>
> Honza
>>
>> I am just at the point of getting a prototype working, I will get a
>> patch you could take a look at posted next week. In the meantime, do
>> you have any comments or feedback?
>>
>> Thanks,
>> Aaron
>>
>> --
>> Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
>> 050-2/C113 (507) 253-7520 home: 507/263-0782
>> IBM Linux Technology Center - PPC Toolchain
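
For anyone following the thread, the "liveness pressure" (or width) measure
described in the quoted mail can be sketched roughly as below. This is only an
illustration in Python, not the prototype patch and not GCC code: the Stmt and
Block types, the dict-based CFG, and the function names are all made up for
the example, and PHI nodes and register classes are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    defs: frozenset  # SSA names defined by this statement (hypothetical model)
    uses: frozenset  # SSA names read by this statement


@dataclass
class Block:
    stmts: list
    succs: list = field(default_factory=list)  # names of successor blocks


def live_out_sets(cfg):
    """Backward dataflow to a fixpoint; returns live-out names per block."""
    live_in = {name: set() for name in cfg}
    live_out = {name: set() for name in cfg}
    changed = True
    while changed:
        changed = False
        for name, block in cfg.items():
            out = set()
            for succ in block.succs:
                out |= live_in[succ]
            live = set(out)
            for stmt in reversed(block.stmts):
                live -= stmt.defs
                live |= stmt.uses
            if out != live_out[name] or live != live_in[name]:
                live_out[name], live_in[name] = out, live
                changed = True
    return live_out


def max_live(cfg):
    """Largest number of names simultaneously live at any point in any block."""
    live_out = live_out_sets(cfg)
    width = 0
    for name, block in cfg.items():
        live = set(live_out[name])
        width = max(width, len(live))
        for stmt in reversed(block.stmts):
            live -= stmt.defs
            live |= stmt.uses
            width = max(width, len(live))
    return width


def S(defs, uses):
    """Shorthand for building a statement from two iterables of names."""
    return Stmt(frozenset(defs), frozenset(uses))


# entry: a = ...; b = ...; c = a + b        exit: d = a + b; e = d + c; return e
cfg = {
    "entry": Block([S("a", ""), S("b", ""), S("c", "ab")], ["exit"]),
    "exit": Block([S("d", "ab"), S("e", "dc"), S("", "e")]),
}
print(max_live(cfg))  # 3: a, b and c are all live on entry to "exit"
```

A heuristic along the lines of points 1) and 2) would then compare this number
for caller and callee against some threshold, which the thread leaves open.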