[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-12 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh closed 
https://github.com/llvm/llvm-project/pull/76955
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-09 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-09 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

> mad_mix

I added run lines to `mad-mix.ll` and it behaves as expected: no fma/mad_mix 
emitted

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-08 Thread Tony Tye via cfe-commits

https://github.com/t-tye approved this pull request.


https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-08 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm commented:

I think this needs codegen tests for the gfx900 vs. gfx906 mad_mix/fma_mix 
issue 

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

@t-tye Can you please approve then? Otherwise the diff still shows a red 
"Changes requested" warning :) Thanks
@arsenm Please also approve if there are no more comments

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Tony Tye via cfe-commits

https://github.com/t-tye commented:

Documentation LGTM. Thanks.

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/6] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets

These generic targets include multiple GPUs and will, in the future, provide a 
way to build once and run on multiple GPUs, at the cost of fewer optimization 
opportunities.
Note that this only covers the compiler side of things; device libs and 
runtimes/loader/etc. don't know about these targets yet, so none of them 
actually work in practice right now. This is just the initial commit to make 
LLVM aware of them.

No docs in this patch either as I plan to do it all in a follow-up patch.
---
 clang/lib/Basic/Targets/AMDGPU.cpp|  20 +-
 clang/test/Driver/amdgpu-macros.cl|   5 +
 clang/test/Driver/amdgpu-mcpu.cl  |  10 +
 llvm/docs/AMDGPUUsage.rst | 325 +-
 llvm/include/llvm/BinaryFormat/ELF.h  |   6 +-
 llvm/include/llvm/TargetParser/TargetParser.h |  10 +
 llvm/lib/Object/ELFObjectFile.cpp |  10 +
 llvm/lib/ObjectYAML/ELFYAML.cpp   |   4 +
 llvm/lib/Target/AMDGPU/AMDGPU.td  |  87 +++--
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   6 +
 .../AMDGPURemoveIncompatibleFunctions.cpp |   6 +-
 llvm/lib/Target/AMDGPU/GCNProcessors.td   |  22 ++
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |   4 +
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp |  26 ++
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |  11 +
 llvm/lib/TargetParser/TargetParser.cpp|  46 +++
 .../GlobalISel/llvm.amdgcn.workitem.id.ll |   1 +
 .../CodeGen/AMDGPU/directive-amdgcn-target.ll |  14 +
 .../CodeGen/AMDGPU/elf-header-flags-mach.ll   |  10 +
 llvm/test/CodeGen/AMDGPU/gds-allocation.ll|   1 +
 llvm/test/CodeGen/AMDGPU/gds-atomic.ll|   1 +
 .../AMDGPU/generic-targets-require-v6.ll  |  18 +
 .../AMDGPU/hsa-generic-target-features.ll |  31 ++
 .../llvm.amdgcn.image.gather4.d16.dim.ll  |   3 +
 .../AMDGPU/llvm.amdgcn.image.sample.dim.ll|   3 +
 .../AMDGPU/unsupported-image-sample.ll|  12 +-
 .../Object/AMDGPU/elf-header-flags-mach.yaml  |  29 ++
 .../llvm-objdump/ELF/AMDGPU/subtarget.ll  |  20 ++
 .../llvm-readobj/ELF/AMDGPU/elf-headers.test  |  12 +
 llvm/tools/llvm-readobj/ELFDumper.cpp | 128 +++
 30 files changed, 689 insertions(+), 192 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/generic-targets-require-v6.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/hsa-generic-target-features.ll

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp 
b/clang/lib/Basic/Targets/AMDGPU.cpp
index 141501e8a4d9a1..799634ccec7ba5 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions 
&Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
 return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-  : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
+ : getArchNameR600(GPUKind))
+  .str();
+
+  // Sanitize the name of generic targets.
+  // e.g. gfx10.1-generic -> gfx10_1_generic
+  if (GPUKind >= llvm::AMDGPU::GK_AMDGCN_GENERIC_FIRST &&
+  GPUKind <= llvm::AMDGPU::GK_AMDGCN_GENERIC_LAST) {
+std::replace(CanonName.begin(), CanonName.end(), '.', '_');
+std::replace(CanonName.begin(), CanonName.end(), '-', '_');
+  }
+
   Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
   // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
   if (isAMDGCN(getTriple()) && !IsHIPHost) {
-assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
-Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
+assert(StringRef(CanonName).starts_with("gfx") &&
+   "Invalid amdgcn canonical name");
+StringRef CanonFamilyName = getArchFamilyNameAMDGCN(GPUKind);
+Builder.defineMacro(Twine("__") + Twine(CanonFamilyName.upper()) +
 Twine("__"));
 Builder.defineMacro("__amdgcn_processor__",
 Twine("\"") + Twine(CanonName) + Twine("\""));
diff --git a/clang/test/Driver/amdgpu-macros.cl 
b/clang/test/Driver/amdgpu-macros.cl
index 81c22af460d12d..3b10444ef71d36 100644
--- a/clang/test/Driver/amdgpu-macros.cl
+++ b/clang/test/Driver/amdgpu-macros.cl
@@ -131,6 +131,11 @@
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1200 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1200 
-DFAMILY=GFX12
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1201 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits


@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see 
:ref:`amdgpu-os`) with the following
 
  === ===  = = 
=== === ==
 
+Generic processors allow execution of a single code objects on any of the 
processors that
+it supports. Such code objects may not perform as well as those for the 
non-generic processors.
+
+Generic processors are only available on code object V6 and above (see 
:ref:`amdgpu-elf-code-object`).
+
+Generic processor code objects are versioned (see 
:ref:`amdgpu-elf-header-e_flags-table-v6-onwards`).
+The version number is used by runtimes to determine if a code object can be 
run on a specific agent.

Pierre-vh wrote:

I rephrased it a bit (e.g. member -> supported processor) but I mostly followed 
your suggestion

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/5] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets

These generic targets include multiple GPUs and will, in the future, provide a 
way to build once and run on multiple GPUs, at the cost of fewer optimization 
opportunities.
Note that this only covers the compiler side of things; device libs and 
runtimes/loader/etc. don't know about these targets yet, so none of them 
actually work in practice right now. This is just the initial commit to make 
LLVM aware of them.

No docs in this patch either as I plan to do it all in a follow-up patch.
---
 clang/lib/Basic/Targets/AMDGPU.cpp|  20 +-
 clang/test/Driver/amdgpu-macros.cl|   5 +
 clang/test/Driver/amdgpu-mcpu.cl  |  10 +
 llvm/docs/AMDGPUUsage.rst | 325 +-
 llvm/include/llvm/BinaryFormat/ELF.h  |   6 +-
 llvm/include/llvm/TargetParser/TargetParser.h |  10 +
 llvm/lib/Object/ELFObjectFile.cpp |  10 +
 llvm/lib/ObjectYAML/ELFYAML.cpp   |   4 +
 llvm/lib/Target/AMDGPU/AMDGPU.td  |  87 +++--
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   6 +
 .../AMDGPURemoveIncompatibleFunctions.cpp |   6 +-
 llvm/lib/Target/AMDGPU/GCNProcessors.td   |  22 ++
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |   4 +
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp |  26 ++
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |  11 +
 llvm/lib/TargetParser/TargetParser.cpp|  46 +++
 .../GlobalISel/llvm.amdgcn.workitem.id.ll |   1 +
 .../CodeGen/AMDGPU/directive-amdgcn-target.ll |  14 +
 .../CodeGen/AMDGPU/elf-header-flags-mach.ll   |  10 +
 llvm/test/CodeGen/AMDGPU/gds-allocation.ll|   1 +
 llvm/test/CodeGen/AMDGPU/gds-atomic.ll|   1 +
 .../AMDGPU/generic-targets-require-v6.ll  |  18 +
 .../AMDGPU/hsa-generic-target-features.ll |  31 ++
 .../llvm.amdgcn.image.gather4.d16.dim.ll  |   3 +
 .../AMDGPU/llvm.amdgcn.image.sample.dim.ll|   3 +
 .../AMDGPU/unsupported-image-sample.ll|  12 +-
 .../Object/AMDGPU/elf-header-flags-mach.yaml  |  29 ++
 .../llvm-objdump/ELF/AMDGPU/subtarget.ll  |  20 ++
 .../llvm-readobj/ELF/AMDGPU/elf-headers.test  |  12 +
 llvm/tools/llvm-readobj/ELFDumper.cpp | 128 +++
 30 files changed, 689 insertions(+), 192 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/generic-targets-require-v6.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/hsa-generic-target-features.ll

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp 
b/clang/lib/Basic/Targets/AMDGPU.cpp
index 141501e8a4d9a..799634ccec7ba 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions 
&Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
 return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-  : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
+ : getArchNameR600(GPUKind))
+  .str();
+
+  // Sanitize the name of generic targets.
+  // e.g. gfx10.1-generic -> gfx10_1_generic
+  if (GPUKind >= llvm::AMDGPU::GK_AMDGCN_GENERIC_FIRST &&
+  GPUKind <= llvm::AMDGPU::GK_AMDGCN_GENERIC_LAST) {
+std::replace(CanonName.begin(), CanonName.end(), '.', '_');
+std::replace(CanonName.begin(), CanonName.end(), '-', '_');
+  }
+
   Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
   // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
   if (isAMDGCN(getTriple()) && !IsHIPHost) {
-assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
-Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
+assert(StringRef(CanonName).starts_with("gfx") &&
+   "Invalid amdgcn canonical name");
+StringRef CanonFamilyName = getArchFamilyNameAMDGCN(GPUKind);
+Builder.defineMacro(Twine("__") + Twine(CanonFamilyName.upper()) +
 Twine("__"));
 Builder.defineMacro("__amdgcn_processor__",
 Twine("\"") + Twine(CanonName) + Twine("\""));
diff --git a/clang/test/Driver/amdgpu-macros.cl 
b/clang/test/Driver/amdgpu-macros.cl
index 81c22af460d12..3b10444ef71d3 100644
--- a/clang/test/Driver/amdgpu-macros.cl
+++ b/clang/test/Driver/amdgpu-macros.cl
@@ -131,6 +131,11 @@
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1200 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1200 
-DFAMILY=GFX12
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1201 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF %s 

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

For the MD changes, it's just to describe the version increment, nothing else. 
I think describing it is important as the V6 diff already updated 
amdhsa.version.
If amdhsa.version didn't need to change, then I need to fix that first, and then 
we can remove the V6 MD section.

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Tony Tye via cfe-commits


@@ -1642,80 +1746,118 @@ The AMDGPU backend uses the following ELF header:
  ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``  0xc00 SRAMECC enabled.
   = 
===
 
+  .. table:: AMDGPU ELF Header ``e_flags`` for Code Object V6 and After
+     :name: amdgpu-elf-header-e_flags-table-v6-onwards
+
+     ============================================ ========== =========================================
+     Name                                         Value      Description
+     ============================================ ========== =========================================
+     ``EF_AMDGPU_MACH``                           0x0ff      AMDGPU processor selection
+                                                             mask for
+                                                             ``EF_AMDGPU_MACH_xxx`` values
+                                                             defined in
+                                                             :ref:`amdgpu-ef-amdgpu-mach-table`.
+     ``EF_AMDGPU_FEATURE_XNACK_V4``               0x300      XNACK selection mask for
+                                                             ``EF_AMDGPU_FEATURE_XNACK_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4``   0x000      XNACK unsupported.
+     ``EF_AMDGPU_FEATURE_XNACK_ANY_V4``           0x100      XNACK can have any value.
+     ``EF_AMDGPU_FEATURE_XNACK_OFF_V4``           0x200      XNACK disabled.
+     ``EF_AMDGPU_FEATURE_XNACK_ON_V4``            0x300      XNACK enabled.
+     ``EF_AMDGPU_FEATURE_SRAMECC_V4``             0xc00      SRAMECC selection mask for
+                                                             ``EF_AMDGPU_FEATURE_SRAMECC_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4`` 0x000      SRAMECC unsupported.
+     ``EF_AMDGPU_FEATURE_SRAMECC_ANY_V4``         0x400      SRAMECC can have any value.
+     ``EF_AMDGPU_FEATURE_SRAMECC_OFF_V4``         0x800      SRAMECC disabled,
+     ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
+     ``EF_AMDGPU_GENERIC_VERSION_V``              0x0100     Value between 1 and 255 for generic code

t-tye wrote:

There needs to be a selection mask like for other fields.
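
To illustrate what a selection mask buys here: with a mask and an offset, a consumer can read the generic version the same way it reads the XNACK or SRAMECC fields. The following is only a sketch, assuming the version occupies the most significant byte of a 32-bit `e_flags` word (as an earlier revision of this table describes). The constant and function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; the patch may spell these differently.
// Assumption: the generic version lives in the top byte of a 32-bit e_flags.
constexpr std::uint32_t kGenericVersionMask = 0xff000000u; // selection mask
constexpr unsigned kGenericVersionShift = 24;              // bit offset of the field

// Returns the generic code object version stored in e_flags (0 if unset).
constexpr std::uint32_t getGenericVersion(std::uint32_t EFlags) {
  return (EFlags & kGenericVersionMask) >> kGenericVersionShift;
}

// Returns e_flags with the generic version field replaced by Version.
// Bits of Version above the low 8 are discarded by the mask.
constexpr std::uint32_t setGenericVersion(std::uint32_t EFlags,
                                          std::uint32_t Version) {
  return (EFlags & ~kGenericVersionMask) |
         ((Version << kGenericVersionShift) & kGenericVersionMask);
}
```

With these helpers, `getGenericVersion(0x01000300u)` yields 1 while leaving the XNACK bits in the low half untouched.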

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Tony Tye via cfe-commits


@@ -4135,6 +4283,33 @@ Code object V5 metadata is the same as
 
  == == = 

 
+.. _amdgpu-amdhsa-code-object-metadata-v6:
+
+Code Object V6 Metadata

+
+.. warning::
+  Code object V6 is not the default code object version emitted by this version
+  of LLVM.
+
+
+Code object V6 metadata is the same as
+:ref:`amdgpu-amdhsa-code-object-metadata-v5` with the changes defined in table
+:ref:`amdgpu-amdhsa-code-object-metadata-map-table-v6`.
+
+  .. table:: AMDHSA Code Object V6 Metadata Map Changes
+ :name: amdgpu-amdhsa-code-object-metadata-map-table-v6
+
+ = == = 
===
+ String KeyValue Type Required? Description
+ = == = 
===
+ "amdhsa.version"  sequence ofRequired  - The first integer is the 
major

t-tye wrote:

I am not sure what metadata changes would be needed to support generic code 
objects. I would not add this section.

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Tony Tye via cfe-commits


@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see 
:ref:`amdgpu-os`) with the following
 
  === ===  = = 
=== === ==
 
+Generic processors allow execution of a single code objects on any of the 
processors that

t-tye wrote:

objects -> object

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Tony Tye via cfe-commits


@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see 
:ref:`amdgpu-os`) with the following
 
  === ===  = = 
=== === ==
 
+Generic processors allow execution of a single code objects on any of the 
processors that
+it supports. Such code objects may not perform as well as those for the 
non-generic processors.
+
+Generic processors are only available on code object V6 and above (see 
:ref:`amdgpu-elf-code-object`).
+
+Generic processor code objects are versioned (see 
:ref:`amdgpu-elf-header-e_flags-table-v6-onwards`).
+The version number is used by runtimes to determine if a code object can be 
run on a specific agent.

t-tye wrote:

This does not really explain how version is used. What about something like:

The version of non-generic code objects is always set to 0.

For a generic code object, adding a new generic member may require the code 
generated for the generic target to be changed so it can continue to execute on 
the previous members as well as on the new member. When this happens the 
generic code object version number is incremented. Each member of the generic 
target has a version when it was introduced. A generic code object can execute 
on a specific member if the version of the code object being loaded is >= the 
version at which the member was introduced.
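
The loading rule suggested above can be sketched in a few lines. This is an illustration only, not runtime code: the `GenericMember` struct, its fields, and `canRunOn` are hypothetical names, and treating version 0 as "never matches" is an assumption based on the statement that non-generic code objects always use version 0.

```cpp
#include <cassert>

// Hypothetical description of one processor covered by a generic target.
struct GenericMember {
  const char *Name;           // placeholder label for the member
  unsigned IntroducedVersion; // generic version at which this member was added
};

// A generic code object can run on a member when its version is at least the
// version at which that member joined the generic target.
// Assumption: version 0 marks a non-generic code object, so it is rejected.
constexpr bool canRunOn(unsigned CodeObjectVersion,
                        const GenericMember &Member) {
  return CodeObjectVersion != 0u &&
         CodeObjectVersion >= Member.IntroducedVersion;
}
```

Under this rule, a version 1 code object runs on members introduced at version 1 but not on a member that was only added at version 2.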

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Matt Arsenault via cfe-commits


@@ -520,6 +520,104 @@ Every processor supports every OS ABI (see 
:ref:`amdgpu-os`) with the following
 
  === ===  = = 
=== === ==
 
+Generic processors also exist.

arsenm wrote:

Weird way to start this section 

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Matt Arsenault via cfe-commits


@@ -156,6 +156,12 @@ void AMDGPUAsmPrinter::emitFunctionBodyStart() {
   const GCNSubtarget &STM = MF->getSubtarget<GCNSubtarget>();
   const Function &F = MF->getFunction();
 
+  // TODO: We're checking this late, would be nice to check it earlier.
+  if (STM.requiresCodeObjectV6() && CodeObjectVersion < AMDGPU::AMDHSA_COV6)

arsenm wrote:

Braces 

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Matt Arsenault via cfe-commits


@@ -139,10 +139,10 @@ bool 
AMDGPURemoveIncompatibleFunctions::checkFunction(Function &F) {
   const GCNSubtarget *ST =
   static_cast<const GCNSubtarget *>(TM->getSubtargetImpl(F));
 
-  // Check the GPU isn't generic. Generic is used for testing only
-  // and we don't want this pass to interfere with it.
+  // Check the GPU isn't generic or generic-hsa. Generic is used for testing
+  // only and we don't want this pass to interfere with it.
   StringRef GPUName = ST->getCPU();
-  if (GPUName.empty() || GPUName.contains("generic"))
+  if (GPUName.empty() || GPUName.starts_with("generic"))

arsenm wrote:

we should probably start refusing to codegen "generic" or whatever none ends up 
getting called 
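
For context on why the hunk switches from `contains` to `starts_with`: the new generic targets have names such as `gfx9-generic`, which contain "generic" without starting with it, so the old check would have skipped them. The sketch below mirrors both checks on plain `std::string` so it compiles without LLVM headers. The helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// True when Name begins with Prefix (stand-in for StringRef::starts_with).
inline bool startsWith(const std::string &Name, const std::string &Prefix) {
  return Name.compare(0, Prefix.size(), Prefix) == 0;
}

// Old check from the hunk: any name containing "generic" is skipped.
inline bool skippedByContains(const std::string &GPUName) {
  return GPUName.empty() || GPUName.find("generic") != std::string::npos;
}

// New check from the hunk: only names that begin with "generic" are skipped.
inline bool skippedByStartsWith(const std::string &GPUName) {
  return GPUName.empty() || startsWith(GPUName, "generic");
}
```

With the new check, `generic` and `generic-hsa` are still skipped, while `gfx9-generic` is processed like any other GPU name.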

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Matt Arsenault via cfe-commits


@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions 
&Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
 return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-  : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)

arsenm wrote:

SmallString? 
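
Background for the question: the patch needs an owning, mutable string because a `StringRef` is a non-owning view and the generic target names must be rewritten before they can appear in a macro name (for example `gfx10.1-generic` becomes `gfx10_1_generic`). Below is a standalone sketch of that sanitization using `std::string`. The function names are hypothetical, and unlike the patch it sanitizes unconditionally instead of only for generic GPU kinds.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Replaces '.' and '-' with '_' so the name is usable inside a macro name.
// Mirrors the two std::replace calls in the patch; the name is hypothetical.
inline std::string sanitizeGenericName(std::string Name) {
  std::replace(Name.begin(), Name.end(), '.', '_');
  std::replace(Name.begin(), Name.end(), '-', '_');
  return Name;
}

// Builds the "__<name>__" macro spelling from a processor name.
inline std::string processorMacro(const std::string &Name) {
  return "__" + sanitizeGenericName(Name) + "__";
}
```

Names without '.' or '-' (such as `gfx906`) pass through unchanged, which is why restricting the rewrite to generic targets, as the patch does, is an optimization and not a correctness requirement in this sketch.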

https://github.com/llvm/llvm-project/pull/76955


[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/2] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets

These generic targets include multiple GPUs and will, in the future, provide a 
way to build once and run on multiple GPUs, at the cost of fewer optimization 
opportunities.
Note that this only covers the compiler side of things; device libs and 
runtimes/loader/etc. don't know about these targets yet, so none of them 
actually work in practice right now. This is just the initial commit to make 
LLVM aware of them.

No docs in this patch either as I plan to do it all in a follow-up patch.
---
 clang/lib/Basic/Targets/AMDGPU.cpp|  20 +-
 clang/test/Driver/amdgpu-macros.cl|   5 +
 clang/test/Driver/amdgpu-mcpu.cl  |  10 +
 llvm/docs/AMDGPUUsage.rst | 325 +-
 llvm/include/llvm/BinaryFormat/ELF.h  |   6 +-
 llvm/include/llvm/TargetParser/TargetParser.h |  10 +
 llvm/lib/Object/ELFObjectFile.cpp |  10 +
 llvm/lib/ObjectYAML/ELFYAML.cpp   |   4 +
 llvm/lib/Target/AMDGPU/AMDGPU.td  |  87 +++--
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   6 +
 .../AMDGPURemoveIncompatibleFunctions.cpp |   6 +-
 llvm/lib/Target/AMDGPU/GCNProcessors.td   |  22 ++
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |   4 +
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp |  26 ++
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |  11 +
 llvm/lib/TargetParser/TargetParser.cpp|  46 +++
 .../GlobalISel/llvm.amdgcn.workitem.id.ll |   1 +
 .../CodeGen/AMDGPU/directive-amdgcn-target.ll |  14 +
 .../CodeGen/AMDGPU/elf-header-flags-mach.ll   |  10 +
 llvm/test/CodeGen/AMDGPU/gds-allocation.ll|   1 +
 llvm/test/CodeGen/AMDGPU/gds-atomic.ll|   1 +
 .../AMDGPU/generic-targets-require-v6.ll  |  18 +
 .../AMDGPU/hsa-generic-target-features.ll |  31 ++
 .../llvm.amdgcn.image.gather4.d16.dim.ll  |   3 +
 .../AMDGPU/llvm.amdgcn.image.sample.dim.ll|   3 +
 .../AMDGPU/unsupported-image-sample.ll|  12 +-
 .../Object/AMDGPU/elf-header-flags-mach.yaml  |  29 ++
 .../llvm-objdump/ELF/AMDGPU/subtarget.ll  |  20 ++
 .../llvm-readobj/ELF/AMDGPU/elf-headers.test  |  12 +
 llvm/tools/llvm-readobj/ELFDumper.cpp | 128 +++
 30 files changed, 689 insertions(+), 192 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/generic-targets-require-v6.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/hsa-generic-target-features.ll

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp 
b/clang/lib/Basic/Targets/AMDGPU.cpp
index 141501e8a4d9a1..799634ccec7ba5 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions 
&Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
 return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-  : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
+ : getArchNameR600(GPUKind))
+  .str();
+
+  // Sanitize the name of generic targets.
+  // e.g. gfx10.1-generic -> gfx10_1_generic
+  if (GPUKind >= llvm::AMDGPU::GK_AMDGCN_GENERIC_FIRST &&
+  GPUKind <= llvm::AMDGPU::GK_AMDGCN_GENERIC_LAST) {
+std::replace(CanonName.begin(), CanonName.end(), '.', '_');
+std::replace(CanonName.begin(), CanonName.end(), '-', '_');
+  }
+
   Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
   // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
   if (isAMDGCN(getTriple()) && !IsHIPHost) {
-assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
-Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
+assert(StringRef(CanonName).starts_with("gfx") &&
+   "Invalid amdgcn canonical name");
+StringRef CanonFamilyName = getArchFamilyNameAMDGCN(GPUKind);
+Builder.defineMacro(Twine("__") + Twine(CanonFamilyName.upper()) +
 Twine("__"));
 Builder.defineMacro("__amdgcn_processor__",
 Twine("\"") + Twine(CanonName) + Twine("\""));
diff --git a/clang/test/Driver/amdgpu-macros.cl 
b/clang/test/Driver/amdgpu-macros.cl
index 81c22af460d12d..3b10444ef71d36 100644
--- a/clang/test/Driver/amdgpu-macros.cl
+++ b/clang/test/Driver/amdgpu-macros.cl
@@ -131,6 +131,11 @@
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1200 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1200 
-DFAMILY=GFX12
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1201 %s 2>&1 | FileCheck 
--check-prefixes=ARCH-GCN,FAST_FMAF

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

@arsenm do you have any concerns with this change?
@t-tye is the documentation good?

https://github.com/llvm/llvm-project/pull/76955


[lld] [flang] [clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-01 Thread Konstantin Zhuravlyov via cfe-commits

https://github.com/kzhuravl approved this pull request.

LGTM! Thanks!

https://github.com/llvm/llvm-project/pull/76955


[lld] [flang] [clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-17 Thread Tony Tye via cfe-commits


@@ -1633,80 +1741,120 @@ The AMDGPU backend uses the following ELF header:
  ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``  0xc00 SRAMECC enabled.
   = 
===
 
+  .. table:: AMDGPU ELF Header ``e_flags`` for Code Object V6 and After
+     :name: amdgpu-elf-header-e_flags-table-v6-onwards
+
+     ============================================ ========== =========================================
+     Name                                         Value      Description
+     ============================================ ========== =========================================
+     ``EF_AMDGPU_MACH``                           0x0ff      AMDGPU processor selection
+                                                             mask for
+                                                             ``EF_AMDGPU_MACH_xxx`` values
+                                                             defined in
+                                                             :ref:`amdgpu-ef-amdgpu-mach-table`.
+     ``EF_AMDGPU_FEATURE_XNACK_V4``               0x300      XNACK selection mask for
+                                                             ``EF_AMDGPU_FEATURE_XNACK_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4``   0x000      XNACK unsupported.
+     ``EF_AMDGPU_FEATURE_XNACK_ANY_V4``           0x100      XNACK can have any value.
+     ``EF_AMDGPU_FEATURE_XNACK_OFF_V4``           0x200      XNACK disabled.
+     ``EF_AMDGPU_FEATURE_XNACK_ON_V4``            0x300      XNACK enabled.
+     ``EF_AMDGPU_FEATURE_SRAMECC_V4``             0xc00      SRAMECC selection mask for
+                                                             ``EF_AMDGPU_FEATURE_SRAMECC_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4`` 0x000      SRAMECC unsupported.
+     ``EF_AMDGPU_FEATURE_SRAMECC_ANY_V4``         0x400      SRAMECC can have any value.
+     ``EF_AMDGPU_FEATURE_SRAMECC_OFF_V4``         0x800      SRAMECC disabled,
+     ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
+     ``EF_AMDGPU_GENERIC_VERSION_V``              0x0100     The most significant byte of EFLAGS
+                                                  to         contains a "generic code object
+                                                  0xff00     version". This is used by runtimes
+                                                             to determine if a generic code
+                                                             object can be run on a
+                                                             machine.
+                                                             NOTE: This is only set for generic
+                                                             targets. (e.g., ``gfx9-generic``).
+                                                             See :ref:`amdgpu-generic-processor-table`
+     ============================================ ========== =========================================
+
   .. table:: AMDGPU ``EF_AMDGPU_MACH`` Values
  :name: amdgpu-ef-amdgpu-mach-table
 
-  == 
=
- Name Value  Description (see
- 
:ref:`amdgpu-processor-table`)
-  == 
=
- ``EF_AMDGPU_MACH_NONE``  0x000  *not specified*
- ``EF_AMDGPU_MACH_R600_R600`` 0x001  ``r600``
- ``EF_AMDGPU_MACH_R600_R630`` 0x002  ``r630``
- ``EF_AMDGPU_MACH_R600_RS880``0x003  ``rs880``
- ``EF_AMDGPU_MACH_R600_RV670``0x004  ``rv670``
- ``EF_AMDGPU_MACH_R600_RV710``0x005  ``rv710``
- ``EF_AMDGPU_MACH_R600_RV730``0x006  ``rv730``
- ``EF_AMDGPU_MACH_R600_RV770``0x007  ``rv770``
- ``EF_AMDGPU_MACH_R600_CEDAR``0x008  ``cedar``
- ``EF_AMDGPU_MACH_R600_CYPRESS``  0x009  ``cypress``
- ``EF_AMDGPU_MACH_R600_JUNIPER``  0x00a  ``juniper``
- ``EF_AMDGPU_MACH_R600_REDWOOD``  0x00b  ``redwood``
- ``EF_AMDGPU_MACH_R600_SUMO`` 0x00c  ``sumo``
- ``EF_AMDGPU_MACH_R600_BARTS``0x00d  ``barts``
- ``EF_AMDGPU_MACH_R600_CAICOS``   0x00e  ``caicos``
- ``EF_AMDGPU_MACH_R600_CAYMAN``   0x00f  ``cayman``
- ``EF_AMDGPU_MACH_R600_TURKS``0x010  ``turks``
- *reserved*   0x011 -Reserved for ``r600``
-  0x01f  architecture processors.
- ``EF_AMDGPU_MACH_AMDGCN_GFX600`` 0x020  ``gfx600``
- ``EF_AMDGPU_MACH_AMDGCN_GFX601``