>
> Thanks for your comments!
>
> I'll continue my experiments with with my initial early_local_passes
> splitting. Will just put there original functions bodies release to
> avoid overhead for their useless optimizations. So, it will be 3 IPA
> passes:
>
> 1. SSA build
> 2. Make instrumented versions + release bodies of original functions
> 3. early_inline + early optimizations.
>
> Will be back with more results.
>
Hi,

I've implemented the chosen approach and performed a set of experiments. Without LTO it mostly worked fine. I ran into some new reachability analysis issues and finally solved them by introducing a new type of IPA reference between the original function and its instrumented version.

Two significant problems were revealed during LTO tests.

The first one was with assembler function names. Without LTO I just used the same assembler name for both the original and the instrumented function versions. With LTO the assembler name becomes a much more important identifier, and it does not seem possible to share the same name between the original and instrumented versions. I started to use a suffix for instrumented function names and chained it with the original assembler name (using the IDENTIFIER_TRANSPARENT_ALIAS flag; unfortunately the alias chain is not followed in all cases, so I additionally had to fix a few places to have original names printed for the instrumented function decls). BTW, the LTO streamer does not preserve the transparent alias chain for identifiers. Is that intentional?

The second problem was the resolution file from the linker. The linker has no idea about the connection between instrumented and non-instrumented functions and therefore may declare instrumented functions as local when external calls to the original exist. This caused more reachability analysis issues (instrumented functions were removed) and a visibility issue (instrumented functions were not marked as global). The visibility issue was simply solved by looking at the original function decl before globalizing the decl. In solving the reachability problems I found it useful to transform original functions into a special kind of thunk having a call edge to the instrumented version. This guarantees that we do not remove the instrumented version when the original is called. It also provides an optimization opportunity for non-instrumented calls (e.g. IPCP goes through thunks and may propagate constant params, though it will require additional support to map params correctly).
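To illustrate the thunk idea, here is a minimal standalone sketch of why a call edge from the original to the instrumented version keeps the latter alive during reachability analysis. It is a toy model only: CallGraph, instrumented_name, make_thunk, reachable and the ".instr" suffix are all hypothetical names made up for this example and are not GCC's cgraph API or the suffix used in the actual patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: function name -> list of callees.
// This is an illustration only, not GCC's cgraph representation.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Hypothetical naming scheme: the instrumented version gets a suffix.
static std::string instrumented_name(const std::string &orig) {
  return orig + ".instr";
}

// Turn ORIG into a thunk: the instrumented version takes over the
// callees of the body, and the original keeps a single call edge to it.
static void make_thunk(CallGraph &g, const std::string &orig) {
  std::vector<std::string> body_callees = g[orig];
  g[instrumented_name(orig)] = body_callees;
  g[orig] = {instrumented_name(orig)};
}

// Simple worklist reachability from a set of externally visible roots.
static std::set<std::string> reachable(const CallGraph &g,
                                       const std::vector<std::string> &roots) {
  std::set<std::string> seen;
  std::vector<std::string> work(roots.begin(), roots.end());
  while (!work.empty()) {
    std::string node = work.back();
    work.pop_back();
    if (!seen.insert(node).second)
      continue;  // already visited
    auto it = g.find(node);
    if (it == g.end())
      continue;  // no body known for this node
    for (const std::string &callee : it->second)
      work.push_back(callee);
  }
  return seen;
}
```

In this model, if only the original is known to be called externally (as the linker resolution would report), the instrumented version is still reached through the thunk edge. Without that edge it would look dead and be removed.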
If my choices look reasonable, I would continue with code cleanup, merging, and preparing a patch series for stage 1.

Thanks,
Ilya
