[PATCH] D112053: [cuda] Add address space predicate funuctions.

Artem Belevich via Phabricator via cfe-commits Tue, 19 Oct 2021 13:35:27 -0700

tra added inline comments.


================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:691-694
+BUILTIN(__nvvm_isspacep_const, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_global, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_local, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_shared, "bvC*", "nc")
----------------
hliao wrote:
> hliao wrote:
> > tra wrote:
> > > CUDA appears to be using `__nv_isGlobal_impl` for the AS predicates. 
> > > Perhaps we want to add those, too, forwarding them to the `__nvvm_...` 
> > > implementations above. I've already added a few other AS-related `__nv_*` 
> > > builtins in `lib/Headers/__clang_cuda_intrinsics.h`.
> > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in CUDA 
> > SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline asm. If 
> > possible, we should avoid defining unofficial or undocumented interfaces. 
> > `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but there is no 
> > documentation on it.
> > 
> >   // This function returns 1 if generic address "ptr" is in global memory 
> > space.
> >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> >   {
> >     unsigned int ret;
> >     asm volatile ("{ \n\t"
> >                   "    .reg .pred p; \n\t"
> >                   "    isspacep.global p, %1; \n\t"
> >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || 
> > defined(__CUDACC_RTC__)
> >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> >   #else
> >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> >   #endif
> > 
> >     return ret;
> >   }
> > 
> typo, `__nv_isGlobal_impl` is *not* exposed as an official interface.
> 
> > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in CUDA 
> > SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline asm. If 
> > possible, we should avoid defining unofficial or undocumented interfaces. 
> > `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but there is no 
> > documentation on it.
> > 
> >   // This function returns 1 if generic address "ptr" is in global memory 
> > space.
> >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> >   {
> >     unsigned int ret;
> >     asm volatile ("{ \n\t"
> >                   "    .reg .pred p; \n\t"
> >                   "    isspacep.global p, %1; \n\t"
> >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || 
> > defined(__CUDACC_RTC__)
> >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> >   #else
> >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> >   #endif
> > 
> >     return ret;
> >   }
> > 
> 
> 
> `__nv_isGlobal_impl` is *not* exposed as an official interface.
> If possible, we should avoid defining unofficial or undocumented interfaces. 
> __nv_isGlobal_impl was introduced from CUDA SDK 10.1 but there is no 
> documentation on it.

In general, I agree. I just wish NVIDIA would stop using undocumented APIs in 
the public headers they ship. By necessity, clang either has to rely on 
preprocessor hacks to edit out uncompileable code, or guess what the 
undocumented APIs do and implement them.

In this case I do not think `__nv_is*_impl`  are used anywhere other than in 
the functions you have renamed, so we're fine without them. They would be 
needed if the patch didn't have to rename `__isGlobal` and friends.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112053/new/

https://reviews.llvm.org/D112053

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112053: [cuda] Add address space predicate funuctions.

Reply via email to