https://bugs.llvm.org/show_bug.cgi?id=41597
Bug ID: 41597
Summary: Certain CUDA codes produce "invalid device function" -
appears to be fixed in trunk
Product: clang
Version: 8.0
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: unassignedclangb...@nondot.org
Reporter: philip.salzm...@uibk.ac.at
CC: htmldevelo...@gmail.com, llvm-bugs@lists.llvm.org,
neeil...@live.com, richard-l...@metafoo.co.uk
For certain CUDA codes, Clang 8 produces an executable that fails at runtime with
> cudaErrorInvalidDeviceFunction (error 8) due to "invalid device function" on
> CUDA API call to cudaLaunch.
It's difficult to pinpoint exactly which conditions trigger this behavior, as it
strikes seemingly "at random". I encountered it in the context of hipSYCL, a
SYCL implementation based on HIP (although the code is compiled as CUDA in this
case). The original issue, with a smallish demo, is here:
https://github.com/illuhad/hipSYCL/issues/49
Unfortunately, the demo carries all of the added complexity of hipSYCL.
This has so far been reproduced using Clang 8 and CUDA 9.2 on Arch, as well as
using Clang 8 and CUDA 10.0 on Ubuntu.
I'd like to provide a self-contained demo. However, compiling the test case with
`-save-temps` produces two files, `*-cuda-nvptx64-nvidia-cuda-sm_52.cui` and
`*-host-x86_64-pc-linux-gnu.cui`. Both are rather large, and I'm not sure whether
there is a way to feed them both back into Clang so that they can be reduced
with delta. Any advice is welcome.
The good news: the issue appears to be fixed in trunk, and I've narrowed the fix
down to https://reviews.llvm.org/D58163. That said, the assertion at the top of
`CGNVCUDARuntime::emitDeviceStub` still fails, i.e., the demo only works when
compiled with a release build of Clang, in which the assertion is compiled out.
This is not surprising, as the commit was meant to address something seemingly
unrelated.
The assertion in question is
> llvm-project/clang/lib/CodeGen/CGCUDANV.cpp:228: virtual void
> {anonymous}::CGNVCUDARuntime::emitDeviceStub(clang::CodeGen::CodeGenFunction&,
> clang::CodeGen::FunctionArgList&): Assertion
> `getDeviceSideName(CGF.CurFuncDecl) == CGF.CurFn->getName() ||
> getDeviceSideName(CGF.CurFuncDecl) + ".stub" == CGF.CurFn->getName() ||
> CGF.CGM.getContext().getTargetInfo().getCXXABI() !=
> CGF.CGM.getContext().getAuxTargetInfo()->getCXXABI()' failed.
and it fails for the broken example (see the GitHub issue) because the two
mangled names differ in the index right after `$_` (`$_0` vs. `$_1`):
> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_0clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE
> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_1clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE
whereas for the working example they are the same.
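To make the mismatch easier to see, here is a minimal C++ sketch of the name
comparison. It is not Clang's code: `stub_name_matches` and `first_divergence`
are hypothetical helpers, only the two name comparisons from the assertion are
modeled (the CXXABI check is left out), and labelling the two names A and B is
an assumption, since the report does not say which one is the device-side name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: models only the two name comparisons from the
// assertion quoted above. The CXXABI disjunct is deliberately left out.
bool stub_name_matches(const std::string& device_side_name,
                       const std::string& cur_fn_name) {
    return device_side_name == cur_fn_name ||
           device_side_name + ".stub" == cur_fn_name;
}

// Hypothetical helper: index of the first character at which the two names
// differ, or the length of the shorter name if one is a prefix of the other.
std::size_t first_divergence(const std::string& a, const std::string& b) {
    // Four-iterator overload of std::mismatch (C++14 and later).
    auto its = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return static_cast<std::size_t>(its.first - a.begin());
}

// The two mangled names from the report. Which of them is the device-side
// name is an assumption; the labels A and B carry no meaning.
const std::string kNameA =
    "_ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_0"
    "clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE";
const std::string kNameB =
    "_ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_1"
    "clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE";
```

With these inputs, `stub_name_matches(kNameA, kNameB)` is false, and
`first_divergence(kNameA, kNameB)` points at the digit following `$_`, which is
the only place where the two names differ.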
Please let me know if there are any additional steps I can take to provide more
context.