================
@@ -548,6 +551,12 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
   if (!Triple.isNVPTX() && !Triple.isSPIRV())
     CmdArgs.push_back("-Wl,--no-undefined");
 
+  // The device inputs are bitcode stored in files with an object extension.
+  // Force the IR input language so Clang runs the compile and backend phases
+  // instead of treating them as linker inputs, which would defer codegen to
+  // the LTO link and defeat the non-LTO pipeline.
+  if (NonLTOAMDGPU)
+    CmdArgs.append({"-x", "ir"});
----------------
yxsamliu wrote:

Good point on PGO. The profile runtime isn't `-mlink`'d, so I now keep LTO when 
`-fprofile-generate` is set. Only plain non-RDC compilation takes the non-LTO 
path, so profile generation still links and optimizes the runtime as before. This does 
highlight the real gap you mentioned: non-RDC non-LTO can't link device-side 
compiler-rt libraries properly, which is part of why the unified RDC/non-RDC 
interface in the FIXME would help.
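
To make the gating described above concrete, here is a minimal standalone sketch. It is not the code from the patch: `DeviceLinkOptions`, `useNonLTOPath`, and `extraClangArgs` are hypothetical names, and the exact condition (AMDGPU, non-RDC, no profile generation) is an assumption based on this comment and on the `NonLTOAMDGPU` flag in the hunk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the state the linker wrapper derives from its
// arguments. None of these names come from the patch itself.
struct DeviceLinkOptions {
  bool IsAMDGPU = false;
  bool RDC = false;             // relocatable device code requested
  bool ProfileGenerate = false; // -fprofile-generate was passed
};

// Assumed gating: only a plain non-RDC AMDGPU link skips LTO. Profile
// generation keeps LTO so the profile runtime is still linked and optimized.
bool useNonLTOPath(const DeviceLinkOptions &Opts) {
  return Opts.IsAMDGPU && !Opts.RDC && !Opts.ProfileGenerate;
}

// On the non-LTO path, force the IR input language so bitcode stored in
// files with an object extension is compiled rather than treated as a
// linker input.
std::vector<std::string> extraClangArgs(const DeviceLinkOptions &Opts) {
  std::vector<std::string> Args;
  if (useNonLTOPath(Opts)) {
    Args.push_back("-x");
    Args.push_back("ir");
  }
  return Args;
}
```

Under these assumptions, a plain non-RDC AMDGPU link gets `-x ir` appended, while the same link with `-fprofile-generate` gets no extra arguments and stays on the LTO path.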


https://github.com/llvm/llvm-project/pull/201135
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
