================
@@ -1399,19 +1399,27 @@ void NVPTXAsmPrinter::emitFunctionParamList(const Function *F, raw_ostream &O) {
       if (PTy) {
         O << "\t.param .u" << PTySizeInBits << " .ptr";
+        bool IsCUDA = static_cast<NVPTXTargetMachine &>(TM).getDrvInterface() ==
+                      NVPTX::CUDA;
         switch (PTy->getAddressSpace()) {
         default:
           break;
         case ADDRESS_SPACE_GLOBAL:
           O << " .global";
           break;
         case ADDRESS_SPACE_SHARED:
+          if (IsCUDA)
+            report_fatal_error(".shared ptr kernel args unsupported in CUDA.");
           O << " .shared";
           break;
         case ADDRESS_SPACE_CONST:
+          if (IsCUDA)
+            report_fatal_error(".const ptr kernel args unsupported in CUDA.");
----------------
Artem-B wrote:

> shows that even the clang frontend with a -cuda triple can still generate
> invalid PTX which will crash when executed in a CUDA context.

When users use `__attribute__((address_space(5)))` incorrectly or inappropriately, it's on them. There are infinitely many ways to write compilable but nonfunctional code, and CUDA is not an exception.

I'd like to understand what exactly the problem is with those wrong-AS pointers. Is it the declaration of the argument as a pointer in an AS other than generic/global that causes the runtime failure? Or is it the dereferencing of that pointer?

If it's the dereferencing that causes trouble, then there may be an argument that users should be allowed to pass around wrong-AS pointer arguments without dereferencing them, if they want to. It's similar to how C++ allows passing around `nullptr`: we know that we can't dereference it, but passing around the pointer value itself is fine. It's just a bunch of bits.

> PTX is technically legal according to the spec, but is invalid when used
> within the CUDA API [...]. We plan on adding public documentation about this
> soon though.

Awesome. If it is something that's going to be part of the PTX spec, then enforcing it in LLVM may be OK, but it would be great to see the details first.

There's also a question of *where* the right place to enforce this restriction is. The asm printer is understandably convenient, but it's far too late in the compilation pipeline and is removed from the user-supplied IR. Ideally we want to associate the error with a specific part of the user input (and LLVM is not great at diagnostics to start with). I wonder if it should be done somewhere in the input IR validation, where we can still dump the problematic IR.

https://github.com/llvm/llvm-project/pull/138706
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits