tra added a comment. With the updated patch description + the discussion I'm OK with the approach from the general "how do we compile/use CUDA" point of view. I'll leave the question of whether the approach works for OpenMP to someone more familiar with it.
While I'm not completely convinced that [fatbin]->.c->[clang]->.o (with device code only)->[ld -r] -> host.o (host+device code) is ideal (things could be done with smaller number of tool invocations), it should help to deal with -rdc compilation until we get a chance to improve support for it in Clang. We may revisit and change this portion of the pipeline when clang can incorporate -rdc GPU binaries in a way compatible with CUDA tools. Repository: rC Clang https://reviews.llvm.org/D47394 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits