tejohnson added a comment.

In D99683#2672578 <https://reviews.llvm.org/D99683#2672578>, @yaxunl wrote:

> In D99683#2672554 <https://reviews.llvm.org/D99683#2672554>, @tejohnson wrote:
>
>> This raises some higher level questions for me:
>>
>> First, how will you deal with other corner cases that won't or cannot be
>> imported right now? While enabling importing of noinline functions and
>> cranking up the threshold will get the majority of functions imported, there
>> are cases that we still won't import (functions/vars that are interposable,
>> certain funcs/vars that cannot be renamed, most non-const variables with
>> non-trivial initializers).
>
> We will document the limitation of thinLTO support of HIP toolchain and
> recommend users not to use thinLTO in those corner cases.
>
>> Second, force importing of everything transitively referenced defeats the
>> purpose of ThinLTO and would probably make it worse than regular LTO. The
>> main entry module will need to import everything transitively referenced
>> from there, so everything not dead in the binary, which should make that
>> module post importing equivalent to a regular LTO module. In addition, every
>> other module needs to transitively import everything referenced from those
>> modules, making them very large depending on how many leaf vs non-leaf
>> functions and variables they contain. What is the goal of doing ThinLTO in
>> this case?
>
> The objective is to improve optimization/codegen time by using multi-threads
> of thinLTO. For example, I have 10 modules each containing a kernel. In full
> LTO linking, I get one big module containing 10 kernels with all functions
> inlined, and I have one thread for optimization/codegen. With thinLTO, I get
> one kernel in each module, with all functions inlined. AMDGPU internalization
> and global DCE will remove functions not used by that kernel in each module.
> I will get 10 threads, each doing optimization/codegen for one kernel.
> Theoretically, there could be 10 times speed up.

That will work as long as there are no dependence edges anywhere between the
kernels. Is this a library that has a bunch of totally independent kernels
only called externally?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99683/new/

https://reviews.llvm.org/D99683

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits