================
@@ -210,6 +210,88 @@ Host Code Compilation
 - These relocatable objects are then linked together.
 - Host code within a TU can call host functions and launch kernels from another TU.
 
+HIP Fat Binary Registration and Unregistration
+==============================================
+
+When compiling HIP for AMD GPUs, Clang embeds device code into HIP "fat
+binaries" and generates host-side helper functions that register these
+fat binaries with the HIP runtime at program start and unregister them at
+program exit. In non-RDC mode (``-fno-gpu-rdc``), each compilation unit
+typically produces its own self-contained fat binary per GPU architecture. In
+RDC mode (``-fgpu-rdc``), device bitcode from multiple compilation units may be
+linked together into a single fat binary per GPU architecture.
+
+At the LLVM IR level, Clang/LLVM typically create an internal module
+constructor (for example ``__hip_module_ctor`` or a ``.hip.fatbin_reg``
+function) and add it to ``@llvm.global_ctors``. This constructor is called by
+the C runtime before ``main`` and it:
+
+* calls ``__hipRegisterFatBinary`` with a pointer to an internal wrapper
+  object that describes the HIP fat binary;
+* stores the returned handle in an internal global variable;
+* calls an internal helper such as ``__hip_register_globals`` to register
+  kernels, device variables and other metadata associated with the fat binary;
+* registers a corresponding module destructor with ``atexit`` so it will run
+  during program termination.
----------------
yxsamliu wrote:

will add a brief summary of the atexit handler here

https://github.com/llvm/llvm-project/pull/168566
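
For readers following the thread, the four constructor steps quoted in the hunk can be sketched roughly as below. This is only a host-side mock under stated assumptions: every `mock_*` name and `MockFatBinWrapper` are hypothetical stand-ins invented for illustration, not the HIP runtime API and not the code Clang actually emits. It also uses the GCC/Clang `__attribute__((constructor))` extension in place of an `@llvm.global_ctors` entry.

```cpp
// Illustrative mock only: the mock_* functions stand in for the HIP runtime
// entry points and are NOT the real API or the code Clang generates.
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the wrapper object that describes a fat binary.
struct MockFatBinWrapper {
  const void *image;  // pointer to the embedded device code blob
};

static const char mock_image[] = "device code placeholder";
static const MockFatBinWrapper mock_wrapper = {mock_image};

// Bookkeeping for the mock "runtime"; constant-initialized, so the values
// written by the constructor below are not reset by later initialization.
static int registered_binaries = 0;
static int registered_kernels = 0;

// Plays the role of the internal global that stores the returned handle.
static void **gpu_binary_handle = nullptr;

// Mock registration: records the wrapper and returns an opaque handle.
static void **mock_register_fat_binary(const void *wrapper) {
  static void *slot;
  slot = const_cast<void *>(wrapper);
  ++registered_binaries;
  return &slot;
}

// Mock unregistration: invalidates the handle's slot.
static void mock_unregister_fat_binary(void **handle) {
  assert(handle != nullptr);
  *handle = nullptr;
  --registered_binaries;
}

// Mock for the helper that registers kernels and device variables.
static void mock_register_globals(void **handle) {
  (void)handle;
  ++registered_kernels;  // pretend the fat binary contains one kernel
}

// Module destructor: unregisters the fat binary at most once.
static void mock_module_dtor() {
  if (gpu_binary_handle != nullptr) {
    mock_unregister_fat_binary(gpu_binary_handle);
    gpu_binary_handle = nullptr;
  }
}

// Module constructor: runs before main, mirroring the four quoted steps.
__attribute__((constructor)) static void mock_module_ctor() {
  gpu_binary_handle = mock_register_fat_binary(&mock_wrapper);  // steps 1-2
  mock_register_globals(gpu_binary_handle);                     // step 3
  std::atexit(mock_module_dtor);                                // step 4
}
```

If this translation unit is linked into a program, `gpu_binary_handle` should already be non-null when `main` starts, and `mock_module_dtor` runs during normal termination. The null check in the destructor is a choice made for this sketch so that calling it twice is harmless; it is not a claim about what the real generated destructor does.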
