That is part of a disagreement I've been having with someone who uses 
this feature. He feels this is supported behavior and not an erroneous 
program. Consider code like:

  void bar() {}
  __host__ __device__ void foo() { bar(); }

If foo is never called from device code, then the program "makes sense": you 
never attempt to execute host code on the GPU, and the compiled program runs 
as expected. This is a silly example, but he is doing some template 
metaprogramming to generate kernels for both host and device, which makes his 
use case understandable. With this patch, the code we generate also runs 
correctly, so the patch isn't just for analysis; it is useful in our code 
generation too.

Normally I would think this could be fixed by #ifdef-guarding the call on 
__CUDA_ARCH__, but if bar were to perform a templated kernel launch, which 
happens in this client's code, that usage would not be allowed under nvcc.
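
For reference, the guard I have in mind looks roughly like the sketch below. 
It covers only the simple case where bar is an ordinary host function, not 
the templated kernel launch case. The host_calls counter and the fallback 
macro definitions are my own additions for illustration, so that the snippet 
also builds as plain C++; they are not from the client's code.

```cpp
// Outside a CUDA compiler, define the attributes away so this sketch
// still builds as ordinary C++ (an assumption made for illustration).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical counter, used only to observe that bar() ran on the host.
static int host_calls = 0;

// Host-only function: it carries no __device__ attribute.
void bar() { ++host_calls; }

__host__ __device__ void foo() {
#ifndef __CUDA_ARCH__
  // __CUDA_ARCH__ is defined only during device compilation, so this
  // call is visible to the host pass and hidden from the device pass.
  bar();
#endif
}
```

Calling foo() from host code reaches bar() as usual, while the device 
compilation of foo sees an empty body.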


http://reviews.llvm.org/D7841




_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
