[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-20 Thread Pushpinder Singh via Phabricator via cfe-commits
pdhaliwal updated this revision to Diff 317810.
pdhaliwal added a comment.
Herald added a subscriber: mgorny.

> Won't this just prevent us from building clang due to the missing cmake 
> changes?

It compiles and builds fine; however, I wasn't aware that such sanity
checking was present. It turns out that unknown files inside llvm/ cause
cmake to report an error, but no such error is reported inside clang.
Maybe such checks are not enabled inside clang. Anyway, thanks for
pointing this out. I will keep that in mind in the future.

The idea for this patch was to introduce the AMDGPUToolChain classes
without much of the functionality, in order to keep its size in check.
A second patch would then have integrated the toolchain with the driver,
along with testing. But in the interval between the two patches, bare
files would have existed in the tree (never built or tested).

I have updated this patch to include a somewhat functional driver along with
tests.
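
For readers following the thread, the device-side steps that the new driver
test checks (llvm-link, then opt, llc and lld) can be sketched as a small
standalone C++ helper. This is only an illustration under stated assumptions,
not the code in AMDGPUOpenMP.cpp: `Command`, `buildDeviceLinkPipeline` and the
intermediate file naming are hypothetical, and only the tool names and flags
are taken from the CHECK lines in clang/test/Driver/amdgpu-openmp-toolchain.c.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One external tool invocation: program name plus its arguments.
struct Command {
  std::string Program;
  std::vector<std::string> Args;
};

// Hypothetical helper (not part of the patch): returns the four device-side
// commands in the order the driver test expects them. `Stem` is an
// illustrative prefix for intermediate file names.
std::vector<Command>
buildDeviceLinkPipeline(const std::vector<std::string> &BitcodeInputs,
                        const std::string &Stem, const std::string &GPUArch) {
  assert(!BitcodeInputs.empty() && "need at least one bitcode input");

  const std::string Triple = "amdgcn-amd-amdhsa";
  const std::string Linked = Stem + "-" + GPUArch + "-linked.bc";
  const std::string Optimized = Stem + "-" + GPUArch + "-optimized.bc";
  const std::string Object = Stem + "-" + GPUArch + ".o";
  const std::string Image = Stem + ".out";

  std::vector<Command> Cmds;

  // 1. llvm-link: merge all device bitcode inputs into one module.
  Command Link{"llvm-link", BitcodeInputs};
  Link.Args.push_back("-o");
  Link.Args.push_back(Linked);
  Cmds.push_back(Link);

  // 2. opt: optimize the linked module for the selected GPU.
  Cmds.push_back({"opt", {Linked, "-mtriple=" + Triple, "-mcpu=" + GPUArch,
                          "-o", Optimized}});

  // 3. llc: lower the optimized bitcode to an object file.
  Cmds.push_back({"llc", {Optimized, "-mtriple=" + Triple, "-mcpu=" + GPUArch,
                          "-filetype=obj", "-o", Object}});

  // 4. lld: link the object into a shared object (the device image).
  Cmds.push_back({"lld", {"-flavor", "gnu", "--no-undefined", "-shared", "-o",
                          Image, Object}});

  return Cmds;
}
```

Calling `buildDeviceLinkPipeline({"a.bc"}, "a", "gfx906")` yields four
commands in the same order as the llvm-link, opt, llc and lld CHECK lines.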


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94961/new/

https://reviews.llvm.org/D94961

Files:
  clang/lib/Driver/CMakeLists.txt
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
  clang/test/Driver/amdgpu-openmp-toolchain.c

Index: clang/test/Driver/amdgpu-openmp-toolchain.c
===
--- /dev/null
+++ clang/test/Driver/amdgpu-openmp-toolchain.c
@@ -0,0 +1,35 @@
+// REQUIRES: amdgpu-registered-target
+// RUN:   %clang -### -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target -march=gfx906 %s 2>&1 \
+// RUN:   | FileCheck %s
+
+// verify the tools invocations
+// CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-x" "c"{{.*}}
+// CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-x" "ir"{{.*}}
+// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists"{{.*}}"-target-cpu" "gfx906" "-fcuda-is-device"{{.*}}"-fopenmp" "-fopenmp-cuda-parallel-target-regions"{{.*}}"-fopenmp-is-device"{{.*}}"-o" {{.*}}amdgpu-openmp-toolchain-{{.*}}.bc{{.*}}"-x" "c"{{.*}}amdgpu-openmp-toolchain.c{{.*}}
+// CHECK: llvm-link{{.*}}amdgpu-openmp-toolchain-{{.*}}.bc" "-o" "{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-linked-{{.*}}.bc"
+// CHECK: opt{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-linked-{{.*}} "-mtriple=amdgcn-amd-amdhsa" "-mcpu=gfx906" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-optimized-{{.*}}.bc"
+// CHECK: llc{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-optimized-{{.*}}.bc" "-mtriple=amdgcn-amd-amdhsa" "-mcpu=gfx906" "-filetype=obj" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o"
+// CHECK: lld{{.*}}"-flavor" "gnu" "--no-undefined" "-shared" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}.out" "{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o"
+// CHECK: clang-offload-wrapper{{.*}}"-target" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out"
+// CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-o" "{{.*}}a-{{.*}}.o" "-x" "ir" "{{.*}}a-{{.*}}.bc"
+// CHECK: ld{{.*}}"-o" "a.out"{{.*}}"{{.*}}amdgpu-openmp-toolchain-{{.*}}.o" "{{.*}}a-{{.*}}.o" "-lomp" "-lomptarget"
+
+// RUN:   %clang -ccc-print-phases -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target -march=gfx906 %s 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-PHASES %s
+// phases
+// CHECK-PHASES: 0: input, "{{.*}}amdgpu-openmp-toolchain.c", c, (host-openmp)
+// CHECK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)
+// CHECK-PHASES: 2: compiler, {1}, ir, (host-openmp)
+// CHECK-PHASES: 3: backend, {2}, assembler, (host-openmp)
+// CHECK-PHASES: 4: assembler, {3}, object, (host-openmp)
+// CHECK-PHASES: 5: input, "{{.*}}amdgpu-openmp-toolchain.c", c, (device-openmp)
+// CHECK-PHASES: 6: preprocessor, {5}, cpp-output, (device-openmp)
+// CHECK-PHASES: 7: compiler, {6}, ir, (device-openmp)
+// CHECK-PHASES: 8: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (amdgcn-amd-amdhsa)" {7}, ir
+// CHECK-PHASES: 9: linker, {8}, image, (device-openmp)
+// CHECK-PHASES: 10: offload, "device-openmp (amdgcn-amd-amdhsa)" {9}, image
+// CHECK-PHASES: 11: clang-offload-wrapper, {10}, ir, (host-openmp)
+// CHECK-PHASES: 12: backend, {11}, assembler, (host-openmp)
+// CHECK-PHASES: 13: assembler, {12}, object, (host-openmp)
+// CHECK-PHASES: 14: linker, {4, 13}, image, (host-openmp)
+
Index: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
===
--- /dev/null
+++ clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
@@ -0,0 +1,123 @@
+//===- AMDGPUOpenMP.h - AMDGPUOpenMP ToolChain Implementation -*- C++ -*---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// 

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-19 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added a comment.

In D94961#2506460, @JonChesterfield wrote:

> This patch was written, roughly, by:
>
> - copying the known-working openmp driver from rocm into the trunk source tree
> - deleting lots of stuff that didn't look necessary
> - deleting some stuff that is broadly necessary, but the specifics are up for 
> debate
>
> The idea is to move language-independent but amdgcn-specific code into 
> ROCMToolChain. Some has already gone in, others (like computeMSVCVersion) 
> will likely move too.
>
> Regarding the rest of the end to end stack:
>
> - host plugin works, same code in trunk / rocm / aomp
> - device plugin will work once it's building as openmp, modulo printf and 
> malloc
> - compiler backend will work for spmd kernels today, will work for generic 
> kernels after D94648 or equivalent lands
>
> Regarding tests (which need the unimplemented bit filled in with the next 
> patch):
>
> - Runtime tests (for spmd kernels) are working locally with a jury-rigged 
> devicertl
> - Codegen tests are proving awkward to update. Dropping line numbers would 
> help, but there's still a difference in addrspace cast distribution. I'm 
> hoping the scripts involved in generating the nvptx cases can be adapted.

Could you add tests for the tool invocations to check that the newly added 
classes/functions correctly translate options/flags?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94961/new/

https://reviews.llvm.org/D94961

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

Won't this just prevent us from building clang due to the missing cmake 
changes? We need somewhat testable chunks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94961/new/

https://reviews.llvm.org/D94961



[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-19 Thread Pushpinder Singh via Phabricator via cfe-commits
pdhaliwal updated this revision to Diff 317553.
pdhaliwal added a comment.

Fix clang-tidy error


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94961/new/

https://reviews.llvm.org/D94961

Files:
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.h

Index: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
===
--- /dev/null
+++ clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
@@ -0,0 +1,85 @@
+//===- AMDGPUOpenMP.h - AMDGPUOpenMP ToolChain Implementation -*- C++ -*---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
+#define LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
+
+#include "AMDGPU.h"
+#include "clang/Driver/Tool.h"
+#include "clang/Driver/ToolChain.h"
+
+namespace clang {
+namespace driver {
+
+namespace tools {
+
+namespace AMDGCN {
+// Runs llvm-link/opt/llc/lld, which links multiple LLVM bitcode, together with
+// device library, then compiles it to ISA in a shared object.
+class LLVM_LIBRARY_VISIBILITY OpenMPLinker : public Tool {
+public:
+  OpenMPLinker(const ToolChain &TC)
+      : Tool("AMDGCN::OpenMPLinker", "amdgcn-link", TC) {}
+
+  bool hasIntegratedCPP() const override { return false; }
+
+  void ConstructJob(Compilation , const JobAction ,
+const InputInfo , const InputInfoList ,
+const llvm::opt::ArgList ,
+const char *LinkingOutput) const override;
+};
+
+} // end namespace AMDGCN
+} // end namespace tools
+
+namespace toolchains {
+
+class LLVM_LIBRARY_VISIBILITY AMDGPUOpenMPToolChain final
+: public ROCMToolChain {
+public:
+  AMDGPUOpenMPToolChain(const Driver , const llvm::Triple ,
+const ToolChain ,
+const llvm::opt::ArgList );
+
+  const llvm::Triple *getAuxTriple() const override {
+return ();
+  }
+
+  llvm::opt::DerivedArgList *
+  TranslateArgs(const llvm::opt::DerivedArgList , StringRef BoundArch,
+Action::OffloadKind DeviceOffloadKind) const override;
+  void
+  addClangTargetOptions(const llvm::opt::ArgList ,
+llvm::opt::ArgStringList ,
+Action::OffloadKind DeviceOffloadKind) const override;
+
+  void addClangWarningOptions(llvm::opt::ArgStringList ) const override;
+  CXXStdlibType GetCXXStdlibType(const llvm::opt::ArgList ) const override;
+  void
+  AddClangSystemIncludeArgs(const llvm::opt::ArgList ,
+llvm::opt::ArgStringList ) const override;
+  void AddIAMCUIncludeArgs(const llvm::opt::ArgList ,
+   llvm::opt::ArgStringList ) const override;
+
+  SanitizerMask getSupportedSanitizers() const override;
+
+  VersionTuple
+  computeMSVCVersion(const Driver *D,
+ const llvm::opt::ArgList ) const override;
+
+  const ToolChain 
+
+protected:
+  Tool *buildLinker() const override;
+};
+
+} // end namespace toolchains
+} // end namespace driver
+} // end namespace clang
+
+#endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
Index: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
===
--- /dev/null
+++ clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -0,0 +1,116 @@
+//===- AMDGPUOpenMP.cpp - AMDGPUOpenMP ToolChain Implementation -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "AMDGPUOpenMP.h"
+#include "AMDGPU.h"
+#include "CommonArgs.h"
+#include "InputInfo.h"
+#include "clang/Driver/Compilation.h"
+#include "clang/Driver/Driver.h"
+#include "clang/Driver/Options.h"
+
+using namespace clang::driver;
+using namespace clang::driver::toolchains;
+using namespace clang::driver::tools;
+using namespace clang;
+using namespace llvm::opt;
+
+// For amdgcn the inputs of the linker job are device bitcode and output is
+// object file. It calls llvm-link, opt, llc, then lld steps.
+void AMDGCN::OpenMPLinker::ConstructJob(Compilation , const JobAction ,
+const InputInfo ,
+const InputInfoList ,
+const ArgList ,
+const char *LinkingOutput) const {
+  llvm_unreachable("Not implemented yet");
+}
+
+AMDGPUOpenMPToolChain::AMDGPUOpenMPToolChain(const Driver ,
+ const 

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

This patch was written, roughly, by:

- copying the known-working openmp driver from rocm into the trunk source tree
- deleting lots of stuff that didn't look necessary
- deleting some stuff that is broadly necessary, but the specifics are up for 
debate

The idea is to move language-independent but amdgcn-specific code into 
ROCMToolChain. Some has already gone in, others (like computeMSVCVersion) will 
likely move too.

Regarding the rest of the end to end stack:

- host plugin works, same code in trunk / rocm / aomp
- device plugin will work once it's building as openmp, modulo printf and malloc
- compiler backend will work for spmd kernels today, will work for generic 
kernels after D94648 or equivalent lands

Regarding tests (which need the unimplemented bit filled in with the next 
patch):

- Runtime tests (for spmd kernels) are working locally with a jury-rigged 
devicertl
- Codegen tests are proving awkward to update. Dropping line numbers would 
help, but there's still a difference in addrspace cast distribution. I'm 
hoping the scripts involved in generating the nvptx cases can be adapted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94961/new/

https://reviews.llvm.org/D94961



[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain skeleton for AMDGPU

2021-01-19 Thread Pushpinder Singh via Phabricator via cfe-commits
pdhaliwal created this revision.
pdhaliwal added reviewers: jdoerfert, grokos, JonChesterfield, ronlieb, ABataev.
Herald added subscribers: kerbowa, guansong, t-tye, tpr, dstuttard, yaxunl, 
nhaehnle, jvesely, kzhuravl.
pdhaliwal requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1, wdng.
Herald added a project: clang.

This patch adds AMDGPUOpenMPToolChain to support OpenMP
offloading to AMD GPUs.

This is the first patch in the series. It provides the basic
skeleton; a few of the methods are marked as llvm_unreachable to
keep it simple, as the next patch will implement them.
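
The skeleton-first approach described above can be illustrated with a
standalone sketch. It does not use LLVM headers: `notImplemented` is a
hypothetical stand-in for llvm_unreachable, and `ToolChainBase` and
`SkeletonDeviceToolChain` are made-up names, not the classes in this patch.
The point is only that the class compiles and its trivial overrides work,
while calling an unfinished method aborts with a message.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for llvm_unreachable: report and abort.
[[noreturn]] inline void notImplemented(const char *What) {
  assert(What != nullptr && "message must be non-null");
  std::fprintf(stderr, "Not implemented yet: %s\n", What);
  std::abort();
}

// Hypothetical, heavily simplified base class for illustration only.
class ToolChainBase {
public:
  virtual ~ToolChainBase() = default;
  virtual std::string tripleName() const = 0;
  virtual std::string buildLinkCommand(const std::string &Input) const = 0;
};

// Skeleton subclass: the interface is complete, so callers and build files
// can be wired up, but the non-trivial method is deferred to a later patch.
class SkeletonDeviceToolChain final : public ToolChainBase {
public:
  std::string tripleName() const override { return "amdgcn-amd-amdhsa"; }

  std::string buildLinkCommand(const std::string &Input) const override {
    (void)Input; // unused until the follow-up patch fills this in
    notImplemented("buildLinkCommand");
  }
};
```

A skeleton like this can still be built and partially exercised, which is
the property the reviewers ask for elsewhere in this thread.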

Originally authored by Greg Rodgers


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94961

Files:
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.h

Index: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
===
--- /dev/null
+++ clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
@@ -0,0 +1,85 @@
+//===- AMDGPUOpenMP.h - AMDGPUOpenMP ToolChain Implementation -*- C++ -*---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
+#define LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
+
+#include "AMDGPU.h"
+#include "clang/Driver/Tool.h"
+#include "clang/Driver/ToolChain.h"
+
+namespace clang {
+namespace driver {
+
+namespace tools {
+
+namespace AMDGCN {
+// Runs llvm-link/opt/llc/lld, which links multiple LLVM bitcode, together with
+// device library, then compiles it to ISA in a shared object.
+class LLVM_LIBRARY_VISIBILITY OpenMPLinker : public Tool {
+public:
+  OpenMPLinker(const ToolChain &TC)
+      : Tool("AMDGCN::OpenMPLinker", "amdgcn-link", TC) {}
+
+  bool hasIntegratedCPP() const override { return false; }
+
+  void ConstructJob(Compilation , const JobAction ,
+const InputInfo , const InputInfoList ,
+const llvm::opt::ArgList ,
+const char *LinkingOutput) const override;
+};
+
+} // end namespace AMDGCN
+} // end namespace tools
+
+namespace toolchains {
+
+class LLVM_LIBRARY_VISIBILITY AMDGPUOpenMPToolChain final
+: public ROCMToolChain {
+public:
+  AMDGPUOpenMPToolChain(const Driver , const llvm::Triple ,
+const ToolChain , const llvm::opt::ArgList ,
+const Action::OffloadKind OK);
+
+  const llvm::Triple *getAuxTriple() const override {
+return ();
+  }
+
+  llvm::opt::DerivedArgList *
+  TranslateArgs(const llvm::opt::DerivedArgList , StringRef BoundArch,
+Action::OffloadKind DeviceOffloadKind) const override;
+  void
+  addClangTargetOptions(const llvm::opt::ArgList ,
+llvm::opt::ArgStringList ,
+Action::OffloadKind DeviceOffloadKind) const override;
+
+  void addClangWarningOptions(llvm::opt::ArgStringList ) const override;
+  CXXStdlibType GetCXXStdlibType(const llvm::opt::ArgList ) const override;
+  void
+  AddClangSystemIncludeArgs(const llvm::opt::ArgList ,
+llvm::opt::ArgStringList ) const override;
+  void AddIAMCUIncludeArgs(const llvm::opt::ArgList ,
+   llvm::opt::ArgStringList ) const override;
+
+  SanitizerMask getSupportedSanitizers() const override;
+
+  VersionTuple
+  computeMSVCVersion(const Driver *D,
+ const llvm::opt::ArgList ) const override;
+
+  const ToolChain 
+
+protected:
+  Tool *buildLinker() const override;
+};
+
+} // end namespace toolchains
+} // end namespace driver
+} // end namespace clang
+
+#endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPUOPENMP_H
Index: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
===
--- /dev/null
+++ clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -0,0 +1,117 @@
+//===- AMDGPUOpenMP.cpp - AMDGPUOpenMP ToolChain Implementation -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "AMDGPUOpenMP.h"
+#include "AMDGPU.h"
+#include "CommonArgs.h"
+#include "InputInfo.h"
+#include "clang/Driver/Compilation.h"
+#include "clang/Driver/Driver.h"
+#include "clang/Driver/Options.h"
+
+using namespace clang::driver;
+using namespace clang::driver::toolchains;
+using namespace clang::driver::tools;
+using namespace clang;
+using namespace llvm::opt;
+
+// For amdgcn the inputs of the linker job are device bitcode and output is
+// object file. It calls