https://github.com/jhuber6 created 
https://github.com/llvm/llvm-project/pull/91984

Summary:
One of the downsides of the linker wrapper is that it made debugging
more difficult. It is very powerful in that it can resolve a lot of
input matching and library handling that could not be done before.
However, the old method allowed users to simply copy-paste the script
files to modify the output and test it.

This patch attempts to make it easier to debug changes by letting the
user override all the linker inputs. That is, we provide a user-created
binary that is treated like the final output of the device link step.
The intended use-case is for using `-save-temps` to get some IR, then
modifying the IR and sticking it back in to see if it exhibits the old
failures.


>From 4164203d3afbd2930a98e720ba29b6da1935626a Mon Sep 17 00:00:00 2001
From: Joseph Huber <hube...@outlook.com>
Date: Mon, 13 May 2024 10:53:55 -0500
Subject: [PATCH] [LinkerWrapper] Add an overriding option for debugging

Summary:
One of the downsides of the linker wrapper is that it made debugging
more difficult. It is very powerful in that it can resolve a lot of
input matching and library handling that could not be done before.
However, the old method allowed users to simply copy-paste the script
files to modify the output and test it.

This patch attempts to make it easier to debug changes by letting the
user override all the linker inputs. That is, we provide a user-created
binary that is treated like the final output of the device link step.
The intended use-case is for using `-save-temps` to get some IR, then
modifying the IR and sticking it back in to see if it exhibits the old
failures.
---
 clang/docs/ClangLinkerWrapper.rst             | 38 ++++++++++++++++
 clang/test/Driver/linker-wrapper.c            |  7 +++
 .../ClangLinkerWrapper.cpp                    | 43 +++++++++++++++++++
 .../clang-linker-wrapper/LinkerWrapperOpts.td |  4 ++
 4 files changed, 92 insertions(+)

diff --git a/clang/docs/ClangLinkerWrapper.rst 
b/clang/docs/ClangLinkerWrapper.rst
index 3bef558475735..5b6c1b1e362f1 100644
--- a/clang/docs/ClangLinkerWrapper.rst
+++ b/clang/docs/ClangLinkerWrapper.rst
@@ -46,6 +46,8 @@ only for the linker wrapper will be forwarded to the wrapped 
linker job.
     -l <libname>           Search for library <libname>
     --opt-level=<O0, O1, O2, or O3>
                            Optimization level for LTO
+    --override-image=<kind=file>
+                            Uses the provided file as if it were the output of 
the device link step
     -o <path>              Path to file to write output
     --pass-remarks-analysis=<value>
                            Pass remarks for LTO
@@ -87,6 +89,42 @@ other. Generally, this requires that the target triple and 
architecture match.
 An exception is made when the architecture is listed as ``generic``, which will
 cause it be linked with any other device code with the same target triple.
 
+Debugging
+=========
+
+The linker wrapper performs a lot of steps internally, such as input matching, 
+symbol resolution, and image registration. This makes it difficult to debug in 
+some scenarios. The behavior of the linker-wrapper is controlled mostly through
+metadata, described in `clang documentation
+<https://clang.llvm.org/docs/OffloadingDesign.html>`_. Intermediate output can 
+be obtained from the linker-wrapper using the ``--save-temps`` flag. These 
files 
+can then be modified.
+
+.. code-block:: sh
+
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a -c
+  $> clang openmp.o -fopenmp --offload-arch=gfx90a -Wl,--save-temps
+  $> ; Modify temp files.
+  $> llvm-objcopy --update-section=.llvm.offloading=out.bc openmp.o
+
+Doing this will allow you to override one of the input files by replacing its 
+embedded offloading metadata with a user-modified version. However, this will 
be 
+more difficult when there are multiple input files. For a very large hammer, 
the 
+``--override-image=<kind>=<file>`` flag can be used.
+
+In the following example, we use the ``--save-temps`` to obtain the LLVM-IR 
just 
+before running the backend. We then modify it to test altered behavior, and 
then 
+compile it to a binary. This can then be passed to the linker-wrapper which 
will 
+then ignore all embedded metadata and use the provided image as if it were the 
+result of the device linking phase.
+
+.. code-block:: sh
+
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a -Wl,--save-temps
+  $> ; Modify temp files.
+  $> clang --target=amdgcn-amd-amdhsa -mcpu=gfx90a -nogpulib out.bc -o a.out
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a 
-Wl,--override-image=openmp=a.out
+
 Example
 =======
 
diff --git a/clang/test/Driver/linker-wrapper.c 
b/clang/test/Driver/linker-wrapper.c
index 51bf98b2ed39d..0d05f913aad63 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -226,3 +226,10 @@ __attribute__((visibility("protected"), used)) int x;
 // RELOCATABLE-LINK-CUDA: fatbinary{{.*}} -64 --create {{.*}}.fatbin 
--image=profile=sm_89,file={{.*}}.img
 // RELOCATABLE-LINK-CUDA: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK-CUDA: llvm-objcopy{{.*}}a.out --remove-section 
.llvm.offloading
+
+// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
+// RUN:   --linker-path=/usr/bin/ld --override=image=openmp=%t.o %t.o -o a.out 
2>&1 \
+// RUN: | FileCheck %s --check-prefix=OVERRIDE
+// OVERRIDE-NOT: clang
+// OVERRIDE: /usr/bin/ld
diff --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp 
b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index 69d8cb446fad1..aee98c5a524ad 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -1149,6 +1149,39 @@ DerivedArgList getLinkerArgs(ArrayRef<OffloadFile> Input,
   return DAL;
 }
 
+Error handleOverrideImages(
+    const InputArgList &Args,
+    DenseMap<OffloadKind, SmallVector<OffloadingImage>> &Images) {
+  for (StringRef Arg : Args.getAllArgValues(OPT_override_image)) {
+    OffloadKind Kind = getOffloadKind(Arg.split("=").first);
+    StringRef Filename = Arg.split("=").second;
+
+    ErrorOr<std::unique_ptr<MemoryBuffer>> BufferOrErr =
+        MemoryBuffer::getFileOrSTDIN(Filename);
+    if (std::error_code EC = BufferOrErr.getError())
+      return createFileError(Filename, EC);
+
+    Expected<std::unique_ptr<ObjectFile>> ElfOrErr =
+        ObjectFile::createELFObjectFile(**BufferOrErr,
+                                        /*InitContent=*/false);
+    if (!ElfOrErr)
+      return ElfOrErr.takeError();
+    ObjectFile &Elf = **ElfOrErr;
+
+    OffloadingImage TheImage{};
+    TheImage.TheImageKind = IMG_Object;
+    TheImage.TheOffloadKind = Kind;
+    TheImage.StringData["triple"] =
+        Args.MakeArgString(Elf.makeTriple().getTriple());
+    if (std::optional<StringRef> CPU = Elf.tryGetCPUName())
+      TheImage.StringData["arch"] = Args.MakeArgString(*CPU);
+    TheImage.Image = std::move(*BufferOrErr);
+
+    Images[Kind].emplace_back(std::move(TheImage));
+  }
+  return Error::success();
+}
+
 /// Transforms all the extracted offloading input files into an image that can
 /// be registered by the runtime.
 Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
@@ -1158,6 +1191,12 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
 
   std::mutex ImageMtx;
   DenseMap<OffloadKind, SmallVector<OffloadingImage>> Images;
+
+  // Initialize the images with any overriding inputs.
+  if (Args.hasArg(OPT_override_image))
+    if (Error Err = handleOverrideImages(Args, Images))
+      return Err;
+
   auto Err = parallelForEachError(LinkerInputFiles, [&](auto &Input) -> Error {
     llvm::TimeTraceScope TimeScope("Link device input");
 
@@ -1439,6 +1478,10 @@ Expected<SmallVector<SmallVector<OffloadFile>>>
 getDeviceInput(const ArgList &Args) {
   llvm::TimeTraceScope TimeScope("ExtractDeviceCode");
 
+  // Skip all the input if the user is overriding the output.
+  if (Args.hasArg(OPT_override_image))
+    return SmallVector<SmallVector<OffloadFile>>();
+
   StringRef Root = Args.getLastArgValue(OPT_sysroot_EQ);
   SmallVector<StringRef> LibraryPaths;
   for (const opt::Arg *Arg : Args.filtered(OPT_library_path, OPT_libpath))
diff --git a/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td 
b/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
index 0a8bd541c4524..eb31b98a3f545 100644
--- a/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
+++ b/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
@@ -74,6 +74,10 @@ def wrapper_jobs : Joined<["--"], "wrapper-jobs=">,
   Flags<[WrapperOnlyOption]>, MetaVarName<"<number>">,
   HelpText<"Sets the number of parallel jobs to use for device linking">;
 
+def override_image : Joined<["--"], "override-image=">,
+  Flags<[WrapperOnlyOption]>, MetaVarName<"<kind=file>">,
+  HelpText<"Uses the provided file as if it were the output of the device link 
step">;
+
 // Flags passed to the device linker.
 def arch_EQ : Joined<["--"], "arch=">,
   Flags<[DeviceOnlyOption, HelpHidden]>, MetaVarName<"<arch>">,

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to