jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, tra, yaxunl, JonChesterfield.
Herald added subscribers: mattd, guansong.
Herald added a project: All.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1, MaskRay.
Herald added a project: clang.
The new driver supports device-only compilation for the offloading device. The way this is handled is a little different from the old offloading driver: the old driver would put all the outputs in the final action list, akin to a linker job, whereas the new driver generates these in the middle of the host's job, so we instead put them all in a single offloading action. However, we only handled these kinds of offloading actions correctly when there was a single input. When we had multiple inputs we would instead attempt to get the host job, which didn't exist, and crash. This patch simply adds some extra logic to generate the jobs for all dependences if there is no host action.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D132248

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/cuda-bindings.cu

Index: clang/test/Driver/cuda-bindings.cu
===================================================================
--- clang/test/Driver/cuda-bindings.cu
+++ clang/test/Driver/cuda-bindings.cu
@@ -146,3 +146,15 @@
 // RUN: --cuda-gpu-arch=sm_52 --cuda-device-only -c -o foo.o %s 2>&1 \
 // RUN: | FileCheck -check-prefix=D_ONLY %s
 // D_ONLY: "foo.o"
+
+//
+// Check to make sure we can generate multiple outputs for device-only
+// compilation.
+//
+// RUN: %clang -### -target powerpc64le-ibm-linux-gnu -ccc-print-bindings \
+// RUN:   --offload-arch=sm_70 --offload-arch=sm_52 --offload-device-only -c -o foo.o %s 2>&1 \
+// RUN: | FileCheck -check-prefix=MULTI-D-ONLY %s
+// MULTI-D-ONLY: # "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[PTX_70:.+]]"
+// MULTI-D-ONLY-NEXT: # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_70]]"], output: "[[CUBIN_70:.+]]"
+// MULTI-D-ONLY-NEXT: # "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]"], output: "[[PTX_52:.+]]"
+// MULTI-D-ONLY-NEXT: # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_52]]"], output: "[[CUBIN_52:.+]]"

Index: clang/lib/Driver/Driver.cpp
===================================================================
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -5261,20 +5261,21 @@
   //                          \
   //    Device Action 1  ---> OffloadAction -> Device Action 2
   //
-  // For a) and b), we just return the job generated for the dependence. For
+  // For a) and b), we just return the job generated for the dependences. For
   // c) and d) we override the current action with the host/device dependence
   // if the current toolchain is host/device and set the offload dependences
   // info with the jobs obtained from the device/host dependence(s).

-  // If there is a single device option, just generate the job for it.
-  if (OA->hasSingleDeviceDependence()) {
+  // If there is a single device option or has no host action, just generate
+  // the job for it.
+  if (OA->hasSingleDeviceDependence() || !OA->hasHostDependence()) {
     InputInfoList DevA;
     OA->doOnEachDeviceDependence([&](Action *DepA, const ToolChain *DepTC,
                                      const char *DepBoundArch) {
-      DevA =
-          BuildJobsForAction(C, DepA, DepTC, DepBoundArch, AtTopLevel,
-                             /*MultipleArchs*/ !!DepBoundArch, LinkingOutput,
-                             CachedResults, DepA->getOffloadingDeviceKind());
+      DevA.append(BuildJobsForAction(C, DepA, DepTC, DepBoundArch, AtTopLevel,
+                                     /*MultipleArchs*/ !!DepBoundArch,
+                                     LinkingOutput, CachedResults,
+                                     DepA->getOffloadingDeviceKind()));
     });
     return DevA;
   }
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits