[llvm-bugs] [Bug 56063] [CUDA] Performance regression in CUDA Clang for the RSBench mini-app

LLVM Bugs via llvm-bugs Thu, 16 Jun 2022 07:46:50 -0700

Issue	56063
Summary	[CUDA] Performance regression in CUDA Clang for the RSBench mini-app
Labels	cuda, performance
Assignees
Reporter	jhuber6

    The [RSBench](https://github.com/ANL-CESAR/RSBench.git) mini-application shows a performance regression when targeting CUDA on my V100 with CUDA 11.6.2. Previously, Clang's performance was roughly on par with NVCC's, with an execution time of about 2.1 seconds on my machine. After commit 0af3e6a22da2eda5021b5fad656d0b9db7702e0a was applied, performance regressed by roughly 33%, to about 3.1 seconds. Reverting this commit locally restores the original performance and matches NVCC. The results were produced using the following commands. I can provide the IR differences later.


```
$ cd cuda/
$ clang++  --offload-arch=sm_70 -O3 -c main.cu -o main.o
$ clang++  --offload-arch=sm_70 -O3 -c simulation.cu -o simulation.o
$ clang++  --offload-arch=sm_70 -O3 -c io.cu -o io.o
$ clang++  --offload-arch=sm_70 -O3 -c init.cu -o init.o
$ clang++  --offload-arch=sm_70 -O3 -c material.cu -o material.o
$ clang++  --offload-arch=sm_70 -O3 -c utils.cu -o utils.o
$ clang++  --offload-arch=sm_70 -O3 main.o simulation.o io.o init.o material.o utils.o -o rsbench -lm -lcudart
$ nvprof ./rsbench -m event
```
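
For reference, the two timings quoted above (about 2.1 s before and about 3.1 s after) can be turned into a percentage in two ways, depending on whether one measures runtime or throughput. The snippet below is only a sketch of that arithmetic; the helper names `runtime_increase_pct` and `throughput_drop_pct` are made up for illustration and are not part of RSBench or LLVM.

```shell
#!/bin/sh
# Hypothetical helpers, not part of RSBench or LLVM.

# Percentage increase in runtime going from $1 seconds to $2 seconds.
runtime_increase_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Percentage drop in throughput (work per second) for the same timings.
throughput_drop_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - before / after) * 100 }'
}

# Approximate timings from the report: 2.1 s before, 3.1 s after.
runtime_increase_pct 2.1 3.1   # prints 47.6
throughput_drop_pct 2.1 3.1    # prints 32.3
```

With these rounded timings, the "roughly 33%" figure lines up with the throughput reading (about 32% lower), while the runtime itself is about 48% longer.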

