[PATCH] D94732: [CUDA] Normalize handling of defaulted dtor.

Richard Smith - zygoloid via Phabricator via cfe-commits Wed, 20 Jan 2021 14:16:53 -0800

rsmith added inline comments.


================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:15162-15170
+  bool SkipDtorChecks = VD->getType()->isArrayType();
+
+  // CUDA: Skip destructor checks for host-only variables during device-side
+  // compilation
+  SkipDtorChecks |=
+      (LangOpts.CUDAIsDevice && VD->hasGlobalStorage() &&
+       !(VD->hasAttr<CUDADeviceAttr>() || VD->hasAttr<CUDAConstantAttr>() ||
----------------
tra wrote:
> rsmith wrote:
> > Is this safe? What happens if the destructor for the variable is a 
> > template, and instantiating that template results in a reference to a 
> > device function? Eg:
> > 
> > ```
> > template<typename T> __device__ void f() {}
> > template<typename T> struct A {
> >   ~A() { f<<<>>>(); }
> > };
> > A a;
> > ```
> This is business as usual -- we catch it during host compilation, where `a` 
> is instantiated.
> 
> ```
> h.cu:3:10: error: no matching function for call to 'f'
>   ~A() { f<T>(); }
>          ^~~~
> h.cu:5:8: note: in instantiation of member function 'A<int>::~A' requested here
> A<int> a;
>        ^
> h.cu:1:51: note: candidate function not viable: call to __device__ function from __host__ function
> template<typename T> __attribute__((device)) void f() {}
> 
> 1 error generated when compiling for host.
> ```
> 
> If it were a `__device__ A<int> a;`, then we would catch it during GPU
> compilation and also complain that we can't have dynamic initializers.
> 
Sorry, the testcase wasn't quite right; I meant for `f` to be `__global__`, not 
`__device__`, so that the kernel call to it works. Fixed example:

```
extern "C" int cudaConfigureCall(int a, int b);
template<typename T> __attribute__((__global__)) void f(T) {}
template<typename T> struct A {
  ~A() { f<<<1, 1>>>(T()); }
};
A<int> a;
```

I think that this is valid. For it to work, we need to trigger instantiation 
of `f<int>` on the device side of the compilation. To do that, we need to 
trigger instantiation of `A<int>::~A()`, so we need to mark the destructor 
referenced on the device side. (This is, I think, in line with the general 
principle that we want to do the same template instantiations of host functions 
on both sides of the compilation, so that both sides agree on which kernel 
functions are referenced.)
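
The instantiation chain can be seen in a rough host-only C++ analogue. This is
a sketch, not CUDA: the kernel launch `f<<<1, 1>>>(T())` is replaced by an
ordinary call, and `launchCount` and `destroyOne` are hypothetical helpers that
exist only to make the effect observable. It shows only that using `A<int>`
forces `A<int>::~A()` to be instantiated, which in turn forces `f<int>`; it
says nothing about how clang splits host and device compilation.

```cpp
// Hypothetical counter standing in for "a kernel was launched".
inline int &launchCount() {
  static int count = 0;
  return count;
}

// Stand-in for the kernel template. In the CUDA example this is
// `template<typename T> __attribute__((__global__)) void f(T) {}`.
template <typename T> void f(T) { ++launchCount(); }

template <typename T> struct A {
  // The destructor body is instantiated only once ~A() is needed, for
  // example because a variable of type A<T> is defined. Instantiating the
  // body is what instantiates f<T>.
  ~A() { f(T()); }
};

// Hypothetical helper: creates and destroys one A<int> and returns how many
// times the stand-in kernel ran as a result.
inline int destroyOne() {
  int before = launchCount();
  { A<int> a; } // ~A<int>() runs here and calls f<int>
  return launchCount() - before;
}
```

Calling `destroyOne()` returns 1: the only thing that refers to `f<int>` is the
destructor, so if the destructor were never marked referenced, `f<int>` would
never be instantiated.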


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94732/new/

https://reviews.llvm.org/D94732

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
