ivanradanov wrote:
passing `-fopenmp --offload-arch=sm_80` above, so
```
clang -fopenmp --offload-arch=sm_80 --verbose -foffload-via-llvm
--cuda-path=/usr/local/cuda input.o -o a.out
```
Gives us the appropriate flags. That means the CUDA toolchain was created,
correct?
I wonder if we need a step in clang that looks at all the .o files for sections
that need device linking, concatenates the archs, and reinvokes itself with
`--offload-arch=<all_collected_archs>`. (It is clang-linker-wrapper's job to
parse the .o files for that, so it is a bit odd to have clang do it.) In theory
the appropriate toolchains would then be created. Perhaps this could kick in
only when `-foffload-via-llvm` is on but no `--offload-arch` is specified, i.e.
when we are asking clang to figure out the appropriate offload archs.
That step could actually be handled by clang-linker-wrapper. You would get:
```
clang -foffload-via-llvm <args>
-> clang-linker-wrapper --detect-archs-and-exec=clang <args>
-> clang -foffload-via-llvm --offload-arch=<detected_archs> <args>
-> clang-linker-wrapper (same as until now)
```
Pretty convoluted, so I don't know if it's appropriate.
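
To make the re-invocation step concrete, here is a rough Python sketch of just the "collect archs and rebuild the command line" part. It assumes the per-object arch detection has already happened somewhere else (that parsing is not shown), and `build_reinvocation` and its inputs are hypothetical names for illustration, not anything that exists in clang or clang-linker-wrapper.

```python
def build_reinvocation(args, detected_archs):
    """Return the clang command line for the second invocation.

    args           -- the original clang arguments (without the "clang" itself)
    detected_archs -- one list of arch names per input object file; assumed to
                      come from a separate detection step that is not shown here
    """
    # If the user already chose architectures, do not second-guess them.
    if any(a.startswith("--offload-arch=") for a in args):
        return ["clang", *args]

    # Flatten and deduplicate, keeping first-seen order so the result is stable.
    archs = list(dict.fromkeys(arch for obj in detected_archs for arch in obj))

    # Emit one --offload-arch flag per detected architecture.
    return ["clang", *args, *[f"--offload-arch={a}" for a in archs]]


if __name__ == "__main__":
    cmd = build_reinvocation(
        ["-fopenmp", "-foffload-via-llvm", "a.o", "b.o"],
        [["sm_80"], ["sm_80", "gfx90a"]],
    )
    print(" ".join(cmd))
```

The early return mirrors the idea that this should only kick in when no `--offload-arch` was given explicitly.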
https://github.com/llvm/llvm-project/pull/149107