jhuber6 added a comment.

In D128914#3643451 <https://reviews.llvm.org/D128914#3643451>, @yaxunl wrote:
> If you only unregister fatbin once for the whole program, then it should be
> safe for -fgpu-rdc. I am not sure if that is the case.

It should be here: the generated handle is private to the registration module we created. We only make one, and it's impossible for anyone else to touch it, even when mixing RDC with non-RDC code.

> My experience with -fgpu-rdc is that it causes much longer linking time for
> large applications like PyTorch or TensorFlow, and LTO does not help. This is
> because the compiler has lots of inter-procedural optimization passes which
> take more than linear time. Due to that those apps need to be compiled as
> -fno-gpu-rdc. Actually most CUDA/HIP applications are using -fno-gpu-rdc.

Yes, it's actually pretty difficult to find a CUDA application using `-fgpu-rdc`. It seems much more common to just put everything that's needed in a single file. I've considered finding a CUDA / HIP benchmark suite and comparing compile times using the new driver. The benefit of making `-fgpu-rdc` the default is that device code behaves essentially like host code, and LTO makes `-fgpu-rdc` perform like `-fno-gpu-rdc`. The downside, as you mentioned, is compile time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128914/new/

https://reviews.llvm.org/D128914

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
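
For illustration only, the "one private handle, unregistered once" argument from the comment can be modeled in plain C++. This is a rough sketch, not the code the driver actually generates: `register_fatbin`, `unregister_fatbin`, `registration_ctor`, `registration_dtor`, and the call counters are all hypothetical stand-ins chosen so the example runs without a GPU runtime. The point it shows is that a handle with internal linkage cannot be named from any other translation unit, so only the module that created it can unregister it.

```cpp
#include <cassert>

// Hypothetical stand-ins for a runtime's fatbinary register/unregister
// entry points. They only count calls, so no GPU runtime is required.
int register_calls = 0;
int unregister_calls = 0;
int dummy_fatbin = 0; // placeholder for the embedded device image

void **register_fatbin(void *image) {
  static void *slot = nullptr; // storage the returned handle points at
  slot = image;
  ++register_calls;
  return &slot;
}

void unregister_fatbin(void **handle) {
  assert(handle != nullptr && "unregister called without a live handle");
  ++unregister_calls;
}

namespace {
// Internal linkage: no other translation unit can name or modify this
// handle, whether that unit was built with or without relocatable device code.
void **fatbin_handle = nullptr;
} // namespace

// Assumed to run once at program start (e.g. from a global constructor).
void registration_ctor() {
  if (fatbin_handle)
    return; // already registered, keep the single handle
  fatbin_handle = register_fatbin(&dummy_fatbin);
}

// Assumed to run once at program exit (e.g. via atexit).
void registration_dtor() {
  if (!fatbin_handle)
    return; // nothing registered, or already unregistered
  unregister_fatbin(fatbin_handle);
  fatbin_handle = nullptr;
}
```

Calling `registration_ctor()` followed by `registration_dtor()` registers and unregisters exactly once in this model; repeated calls to either are no-ops. The guards are an assumption of the sketch, not a claim about the generated module.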