[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

Artem Belevich via cfe-commits Tue, 13 May 2025 12:44:26 -0700

================
@@ -1399,19 +1399,27 @@ void NVPTXAsmPrinter::emitFunctionParamList(const Function *F, raw_ostream &O) {
       if (PTy) {
         O << "\t.param .u" << PTySizeInBits << " .ptr";
 
+        bool IsCUDA = static_cast<NVPTXTargetMachine &>(TM).getDrvInterface() ==
+                      NVPTX::CUDA;
         switch (PTy->getAddressSpace()) {
         default:
           break;
         case ADDRESS_SPACE_GLOBAL:
           O << " .global";
           break;
         case ADDRESS_SPACE_SHARED:
+          if (IsCUDA)
+            report_fatal_error(".shared ptr kernel args unsupported in CUDA.");
           O << " .shared";
           break;
         case ADDRESS_SPACE_CONST:
+          if (IsCUDA)
+            report_fatal_error(".const ptr kernel args unsupported in CUDA.");
----------------
Artem-B wrote:


> shows that even the clang frontend with a -cuda triple can still generate 
> invalid PTX which will crash when executed in a CUDA context.

When users use `__attribute__((address_space(5)))` incorrectly or 
inappropriately, it's on them. There are infinitely many ways to write 
compilable but nonfunctional code, and CUDA is no exception.

I'd like to understand what exactly the problem is with those wrong-AS 
pointers.
Is it declaring the argument as a pointer in an AS other than generic or 
global that causes the runtime failure?
Or is it the dereferencing of that pointer?

If it's the dereferencing that causes trouble, then there may be an argument 
that users should be allowed to pass around wrong-AS pointer arguments without 
dereferencing, if they want to. Sort of like C++ allows passing around 
`nullptr` -- we do know that we can't dereference it, but passing around the 
pointer value itself is fine. It's just a bunch of bits.
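
To make the `nullptr` analogy concrete, here is a minimal plain C++ sketch. The helper names are made up for illustration and have nothing to do with the patch; the point is only that forwarding a pointer value never touches the pointee, while a dereference is the step that has to be guarded.

```cpp
// Forwards the pointer value untouched; the pointee is never accessed,
// so this is well-defined even for nullptr.
inline const int *forwardPointer(const int *P) { return P; }

// Dereferences only after checking the pointer; this is the one place
// where an unusable pointer value would actually matter.
inline int loadOrDefault(const int *P, int Default) {
  return P ? *P : Default;
}
```

Whether a wrong-AS kernel argument behaves the same way under CUDA is exactly the open question here; the sketch does not answer it.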

> PTX is technically legal according to the spec, but is invalid when used 
> within the CUDA API [...]. We plan on adding public documentation about this 
> soon though.

Awesome. If it is something that's going to be part of the PTX spec, then 
enforcing it in LLVM may be OK, but it would be great to see the details first.

There's also a question of *where* the right place to enforce this 
restriction is. The asm printer is understandably convenient, but it's far too 
late in the compilation pipeline and is removed from the user-supplied IR. 
Ideally we want to associate the error with a specific part of the user input 
(and LLVM is not great at diagnostics to start with).

I wonder if it should be done somewhere in the input IR validation where we can 
still dump the problematic IR.
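
A rough sketch of what such an early check might look like. It is modeled on plain structs rather than the real LLVM IR classes, so `Param`, `Kernel`, and `checkKernelParams` are all hypothetical names, and it rejects only the two address spaces visible in the quoted hunk (shared and const). The address-space numbers follow the NVPTX numbering. The idea is to return a diagnostic that names the kernel and the parameter instead of aborting at PTX emission time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// NVPTX address-space numbering.
enum AddressSpace : unsigned {
  Generic = 0, Global = 1, Shared = 3, Const = 4, Local = 5
};

// Hypothetical stand-ins for IR entities; these are NOT LLVM classes.
struct Param {
  std::string Name;
  bool IsPointer;
  unsigned AddrSpace;
};

struct Kernel {
  std::string Name;
  std::vector<Param> Params;
};

// Returns one diagnostic per pointer parameter whose address space is
// rejected for CUDA. Non-CUDA driver interfaces are left alone.
std::vector<std::string> checkKernelParams(const Kernel &K, bool IsCUDA) {
  std::vector<std::string> Diags;
  if (!IsCUDA)
    return Diags;
  for (std::size_t I = 0; I < K.Params.size(); ++I) {
    const Param &P = K.Params[I];
    if (!P.IsPointer)
      continue;
    // Assumption: only the cases shown in the quoted hunk are rejected.
    if (P.AddrSpace == Shared || P.AddrSpace == Const)
      Diags.push_back("kernel '" + K.Name + "': parameter #" +
                      std::to_string(I) + " ('" + P.Name +
                      "') is a pointer in address space " +
                      std::to_string(P.AddrSpace) +
                      ", unsupported for CUDA kernel arguments");
  }
  return Diags;
}
```

Collecting diagnostics rather than calling `report_fatal_error` on the first hit means every offending parameter can be reported alongside the IR it came from.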



https://github.com/llvm/llvm-project/pull/138706
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
