jhuber6 added a comment.

In D128914#3643451 <https://reviews.llvm.org/D128914#3643451>, @yaxunl wrote:

> If you only unregister the fatbin once for the whole program, then it should be 
> safe for -fgpu-rdc. I am not sure if that is the case.

It should be here. The generated handle is private to the registration module 
we create. We only make one, and it's impossible for anyone else to touch it, 
even when mixing RDC with non-RDC code.
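
As a rough illustration of why a private handle is safe, here is a minimal 
sketch. It is not the code the linker wrapper actually emits. 
`fakeRegisterFatBinary` and `fakeUnregisterFatBinary` are hypothetical 
stand-ins for the runtime's registration entry points, and the counters exist 
only so the behaviour can be checked. The point is that the handle has 
internal linkage, so only this one module can register or unregister it.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's fatbinary registration calls.
// The real CUDA/HIP entry points have different names and signatures.
static int LiveRegistrations = 0;
static int TotalUnregisterCalls = 0;

static void **fakeRegisterFatBinary(const void *Image) {
  static void *Slot = nullptr;
  Slot = const_cast<void *>(Image);
  ++LiveRegistrations;
  return &Slot;
}

static void fakeUnregisterFatBinary(void **Handle) {
  assert(Handle && "unregistering a null handle");
  ++TotalUnregisterCalls;
  --LiveRegistrations;
}

namespace {
// Placeholder for the single, fully linked device image (assumption).
const char LinkedDeviceImage[] = "device image placeholder";

// Internal linkage: no other translation unit can name this handle,
// whether it was built with -fgpu-rdc or -fno-gpu-rdc.
void **Handle = nullptr;

// Registers the image at most once.
void registerModule() {
  if (!Handle)
    Handle = fakeRegisterFatBinary(LinkedDeviceImage);
}

// Unregisters at most once; repeated calls are no-ops.
void unregisterModule() {
  if (Handle) {
    fakeUnregisterFatBinary(Handle);
    Handle = nullptr;
  }
}
} // namespace
```

In a real program one would presumably call `registerModule` from a global 
constructor and `unregisterModule` at exit. That wiring is left out here to 
keep the sketch small.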

> My experience with -fgpu-rdc is that it causes much longer linking time for 
> large applications like PyTorch or TensorFlow, and LTO does not help. This is 
> because the compiler has lots of inter-procedural optimization passes which 
> take more than linear time. Because of that, those apps need to be compiled 
> with -fno-gpu-rdc. Actually, most CUDA/HIP applications are using -fno-gpu-rdc.

Yes, it's actually pretty difficult to find a CUDA application using 
`-fgpu-rdc`. It seems much more common to just put everything that's needed in 
a single file. I've considered finding a CUDA / HIP benchmark suite and 
comparing compile times using the new driver. The benefit of making 
`-fgpu-rdc` the default is that device code behaves essentially like host 
code, and LTO makes `-fgpu-rdc` match `-fno-gpu-rdc` performance-wise. The 
downside, as you mentioned, is compile time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128914/new/

https://reviews.llvm.org/D128914

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
