jhuber6 added a comment.

In D132248#3735793 <https://reviews.llvm.org/D132248#3735793>, @tra wrote:

>> The old driver would put all the outputs in the final action list akin to a 
>> linker job.
>
> IIRC that's where HIP and CUDA behaved differently. CUDA compilation does not 
> allow device-only compilation for multiple targets if we have explicitly 
> specified output. It does produce individual per-gpu .o files if compiled 
> without `-o`.
>
>   bin/clang++ --cuda-path=$HOME/local/cuda-11.7 --offload-arch=sm_80 --offload-arch=sm_86 -x cuda axpy.cu --cuda-device-only -O3 -c -o axpy.o
>   clang-15: error: cannot specify -o when generating multiple output files

Is this an architectural limitation? I'd expect CUDA and HIP to behave the 
same way here in my implementation.



================
Comment at: clang/test/Driver/cuda-bindings.cu:160
+// MULTI-D-ONLY-NEXT: # "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]"], output: "[[PTX_52:.+]]"
+// MULTI-D-ONLY-NEXT: # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_52]]"], output: "[[CUBIN_52:.+]]"
----------------
tra wrote:
> If we've specified `-o foo.o`, where do those multiple outputs go to?
> 
> The old driver disallowed using `-o` when compiling for multiple GPUs.
Good catch. Right now it'll just write both of them to the same file.
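
For illustration, the old driver's restriction could be sketched as a 
standalone check along these lines. This is only a rough sketch: 
`checkExplicitOutput` and its parameters are hypothetical and are not part of 
clang's driver; only the diagnostic text is taken from the error quoted above.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, NOT clang's Driver API. Returns a diagnostic message
// when an explicit -o is combined with a device-only compilation that would
// produce more than one output (one per offload architecture).
std::optional<std::string>
checkExplicitOutput(const std::vector<std::string> &offloadArchs,
                    bool deviceOnly, bool hasExplicitOutput) {
  if (deviceOnly && hasExplicitOutput && offloadArchs.size() > 1)
    return "cannot specify -o when generating multiple output files";
  // A single architecture, or no -o, leaves each output with its own file.
  return std::nullopt;
}
```

With `{"sm_80", "sm_86"}`, device-only mode, and `-o` present, the check 
yields the diagnostic; with a single architecture or without `-o` it yields 
nothing.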


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132248/new/

https://reviews.llvm.org/D132248
