That is part of a disagreement I've been having with someone who uses this
feature. He feels this is supported behavior rather than an erroneous
program. Consider code like:
void bar() {}
__host__ __device__ void foo() { bar(); }
If foo is never called from the device, then the program "makes sense": you
never attempt to execute host code on the GPU, and the compiled program runs
as expected. This is a silly example, but he is doing some template
metaprogramming to generate kernels for both host and device, which makes his
use case understandable. With this patch, the code we generate also runs
correctly, so it isn't just for analysis; it is useful in our code generation
too.
Normally I would think this could be fixed by ifdef-guarding on
__CUDA_ARCH__, but if bar were to perform a templated kernel launch, which
happens in this client's code, then nvcc would not allow that usage.
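
For reference, the ifdef-guarding I have in mind looks roughly like the
sketch below. It is only an illustration, not the client's code: bar is
changed to return an int so the effect is observable, the -1 on the device
path is an arbitrary placeholder, and the empty fallback macros are an
assumption added so the snippet also builds with a plain host compiler. It
does not cover the templated kernel launch case mentioned above.

```cpp
// Assumption for this sketch: when not compiling as CUDA, define the
// attributes away so a plain host C++ compiler accepts the code.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Host-only helper: no __device__ attribute, so it must not run on the GPU.
// (Hypothetical int return value, used only to make the result visible.)
int bar() { return 42; }

// Callable from both host and device. The call to the host-only bar() is
// compiled only when __CUDA_ARCH__ is undefined, i.e. in the host pass.
__host__ __device__ int foo() {
#ifndef __CUDA_ARCH__
  return bar();  // host compilation pass
#else
  return -1;     // device compilation pass: bar() is never referenced
#endif
}
```

On the host, foo() simply forwards to bar(); in the device pass the call to
bar() is not compiled at all, so the question of calling a host function from
device code never comes up.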
http://reviews.llvm.org/D7841
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits