>> Since the header file and library seem not to affect this patch, is it OK to 
>> defer their changes to be part of the toolchain patch?
> I'm not sure I understand. Could you elaborate?

clang -cc1 does not include `__clang_cuda_runtime_wrapper.h` by default when it 
is invoked directly to compile CUDA programs. It is the CUDA toolchain that adds 
-include `__clang_cuda_runtime_wrapper.h` when compiling a CUDA program as kernel 
code. Therefore, if clang -cc1 is used to compile a HIP program in a lit test, 
there is no need to use `-nocudainc`.

This patch mainly changes the kernel launching API function names. The 
implementation and testing of this change do not depend on the CUDA/HIP header 
files. A minimal header like test/CodeGenCUDA/Inputs/cuda.h is sufficient for 
testing this patch.

Basically, this patch only concerns -cc1 and is therefore independent of the 
toolchain changes to the header and library files.


