https://bugs.llvm.org/show_bug.cgi?id=41597

            Bug ID: 41597
           Summary: Certain CUDA codes produce "invalid device function" -
                    appears to be fixed in trunk
           Product: clang
           Version: 8.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: -New Bugs
          Assignee: unassignedclangb...@nondot.org
          Reporter: philip.salzm...@uibk.ac.at
                CC: htmldevelo...@gmail.com, llvm-bugs@lists.llvm.org,
                    neeil...@live.com, richard-l...@metafoo.co.uk

For certain CUDA codes, Clang 8 produces an executable that fails at runtime with the error

> cudaErrorInvalidDeviceFunction (error 8) due to "invalid device function" on 
> CUDA API call to cudaLaunch.

It's difficult to pinpoint exactly which conditions trigger this behavior, as
it appears to strike at random. I encountered it in the context of hipSYCL, a
SYCL implementation based on HIP (although the code is compiled as CUDA in this
case). The original issue, with a smallish demo, is at
https://github.com/illuhad/hipSYCL/issues/49. Unfortunately, that demo carries
all of the added complexity of hipSYCL.

This has so far been reproduced using Clang 8 and CUDA 9.2 on Arch, as well as
using Clang 8 and CUDA 10.0 on Ubuntu.

I'd like to provide a self-contained demo; however, compiling the test case
with `-save-temps` produces two files, `*-cuda-nvptx64-nvidia-cuda-sm_52.cui`
and `*-host-x86_64-pc-linux-gnu.cui`. Both are rather large, and I'm not sure
whether there is a way to feed them both back into Clang so that I can reduce
them with delta. Any advice is welcome.

The good news: the issue appears to be fixed in trunk, and I've narrowed the
fix down to https://reviews.llvm.org/D58163. That said, the assertion at the
top of `CGNVCUDARuntime::emitDeviceStub` still fails, so the demo only works
when compiled with a release build of Clang (where assertions are presumably
disabled). This is not surprising, as the commit was meant to address something
seemingly unrelated.

The assertion in question is

> llvm-project/clang/lib/CodeGen/CGCUDANV.cpp:228: virtual void 
> {anonymous}::CGNVCUDARuntime::emitDeviceStub(clang::CodeGen::CodeGenFunction&,
>  clang::CodeGen::FunctionArgList&): Assertion 
> `getDeviceSideName(CGF.CurFuncDecl) == CGF.CurFn->getName() || 
> getDeviceSideName(CGF.CurFuncDecl) + ".stub" == CGF.CurFn->getName() || 
> CGF.CGM.getContext().getTargetInfo().getCXXABI() != 
> CGF.CGM.getContext().getAuxTargetInfo()->getCXXABI()' failed.
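
To make the three-way condition easier to read, here is a minimal standalone
sketch of the same predicate using plain strings. The function name and
parameters are hypothetical stand-ins and not part of Clang:
`deviceSideName` plays the role of `getDeviceSideName(CGF.CurFuncDecl)`,
`hostFnName` the role of `CGF.CurFn->getName()`, and `abiDiffers` models the
host and auxiliary targets reporting different C++ ABIs.

```cpp
#include <string>

// Hypothetical restatement of the assertion's condition, not Clang code.
// Returns true when the assertion would hold:
//   1. the device-side name equals the emitted function's name, or
//   2. the emitted function's name is the device-side name plus ".stub", or
//   3. the host and device targets use different C++ ABIs.
bool stubNameIsConsistent(const std::string& deviceSideName,
                          const std::string& hostFnName,
                          bool abiDiffers) {
    return deviceSideName == hostFnName ||
           deviceSideName + ".stub" == hostFnName ||
           abiDiffers;
}
```

With both targets on the same C++ ABI, any mismatch between the two names
other than a trailing `.stub` makes the condition false.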

and it fails in the failing case (see the GitHub issue) because the two mangled
names differ in the digit right after the `$_` (`$_0` versus `$_1`):

> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_0clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE
> _ZN2cl4sycl6detail8dispatch19parallel_for_kernelILi1EZZ4mainENK3$_1clERNS0_7handlerEEUlNS0_4itemILi1ELb1EEEE_EEvT0_NS0_5rangeIXT_EEE

whereas for the working example they are the same.
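
For anyone comparing such long mangled names by hand, a small helper along
these lines can locate the first differing character. This is only a sketch;
`firstDifference` is a hypothetical name, and it uses nothing beyond the
standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Returns the index of the first character at which a and b differ.
// If one string is a proper prefix of the other, returns the length of the
// shorter one. Returns std::string::npos if the strings are identical.
std::size_t firstDifference(const std::string& a, const std::string& b) {
    // Four-iterator overload (C++14) stops at the end of the shorter range.
    auto p = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    if (p.first == a.end() && p.second == b.end()) {
        return std::string::npos;
    }
    return static_cast<std::size_t>(p.first - a.begin());
}
```

Applied to the two names quoted above, the returned index should point at the
digit following `$_`.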

Please let me know if there are any additional steps I can take to provide more
context.
