[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/76955 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
Pierre-vh wrote:

> mad_mix

I added run lines to `mad-mix.ll` and it behaves as expected: no fma/mad_mix emitted

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/t-tye approved this pull request. https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/arsenm commented: I think this needs codegen tests for the gfx900 vs. gfx906 mad_mix/fma_mix issue https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
Pierre-vh wrote:

@t-tye Can you please approve then? Otherwise the diff still shows a red "Changes requested" warning :) Thanks

@arsenm Please also approve if there are no more comments

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/t-tye commented: Documentation LGTM. Thanks. https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/6] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets

These generic targets include multiple GPUs and will, in the future,
provide a way to build once and run on multiple GPUs, at the cost of fewer
optimization opportunities.

Note that this is just doing the compiler side of things, device libs and
runtimes/loader/etc. don't know about these targets yet, so none of them
actually work in practice right now. This is just the initial commit to make
LLVM aware of them.

No docs in this patch either as I plan to do it all in a follow-up patch.
---
 clang/lib/Basic/Targets/AMDGPU.cpp            |  20 +-
 clang/test/Driver/amdgpu-macros.cl            |   5 +
 clang/test/Driver/amdgpu-mcpu.cl              |  10 +
 llvm/docs/AMDGPUUsage.rst                     | 325 +-
 llvm/include/llvm/BinaryFormat/ELF.h          |   6 +-
 llvm/include/llvm/TargetParser/TargetParser.h |  10 +
 llvm/lib/Object/ELFObjectFile.cpp             |  10 +
 llvm/lib/ObjectYAML/ELFYAML.cpp               |   4 +
 llvm/lib/Target/AMDGPU/AMDGPU.td              |  87 +++--
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   6 +
 .../AMDGPURemoveIncompatibleFunctions.cpp     |   6 +-
 llvm/lib/Target/AMDGPU/GCNProcessors.td       |  22 ++
 llvm/lib/Target/AMDGPU/GCNSubtarget.h         |   4 +
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp     |  26 ++
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |  11 +
 llvm/lib/TargetParser/TargetParser.cpp        |  46 +++
 .../GlobalISel/llvm.amdgcn.workitem.id.ll     |   1 +
 .../CodeGen/AMDGPU/directive-amdgcn-target.ll |  14 +
 .../CodeGen/AMDGPU/elf-header-flags-mach.ll   |  10 +
 llvm/test/CodeGen/AMDGPU/gds-allocation.ll    |   1 +
 llvm/test/CodeGen/AMDGPU/gds-atomic.ll        |   1 +
 .../AMDGPU/generic-targets-require-v6.ll      |  18 +
 .../AMDGPU/hsa-generic-target-features.ll     |  31 ++
 .../llvm.amdgcn.image.gather4.d16.dim.ll      |   3 +
 .../AMDGPU/llvm.amdgcn.image.sample.dim.ll    |   3 +
 .../AMDGPU/unsupported-image-sample.ll        |  12 +-
 .../Object/AMDGPU/elf-header-flags-mach.yaml  |  29 ++
 .../llvm-objdump/ELF/AMDGPU/subtarget.ll      |  20 ++
 .../llvm-readobj/ELF/AMDGPU/elf-headers.test  |  12 +
 llvm/tools/llvm-readobj/ELFDumper.cpp         | 128 +++
 30 files changed, 689 insertions(+), 192 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/generic-targets-require-v6.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/hsa-generic-target-features.ll

diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 141501e8a4d9a1..799634ccec7ba5 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
     return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-                                              : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
+                                                 : getArchNameR600(GPUKind))
+                              .str();
+
+  // Sanitize the name of generic targets.
+  // e.g. gfx10.1-generic -> gfx10_1_generic
+  if (GPUKind >= llvm::AMDGPU::GK_AMDGCN_GENERIC_FIRST &&
+      GPUKind <= llvm::AMDGPU::GK_AMDGCN_GENERIC_LAST) {
+    std::replace(CanonName.begin(), CanonName.end(), '.', '_');
+    std::replace(CanonName.begin(), CanonName.end(), '-', '_');
+  }
+
   Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
   // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
   if (isAMDGCN(getTriple()) && !IsHIPHost) {
-    assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
-    Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
+    assert(StringRef(CanonName).starts_with("gfx") &&
+           "Invalid amdgcn canonical name");
+    StringRef CanonFamilyName = getArchFamilyNameAMDGCN(GPUKind);
+    Builder.defineMacro(Twine("__") + Twine(CanonFamilyName.upper()) +
                         Twine("__"));
     Builder.defineMacro("__amdgcn_processor__",
                         Twine("\"") + Twine(CanonName) + Twine("\""));
diff --git a/clang/test/Driver/amdgpu-macros.cl b/clang/test/Driver/amdgpu-macros.cl
index 81c22af460d12d..3b10444ef71d36 100644
--- a/clang/test/Driver/amdgpu-macros.cl
+++ b/clang/test/Driver/amdgpu-macros.cl
@@ -131,6 +131,11 @@
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1200 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1200 -DFAMILY=GFX12
 // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1201 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF
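For readers who want to try the name sanitization from the AMDGPU.cpp hunk outside of clang, here is a minimal standalone sketch. It only mirrors the two `std::replace` calls and the `__name__` wrapping shown in the patch; `macroNameFor` and its `IsGeneric` flag are hypothetical stand-ins for the `GPUKind` range check and are not clang functions.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, not part of clang: builds the "__<name>__" macro
// identifier the way the patch does for generic targets.
std::string macroNameFor(std::string CanonName, bool IsGeneric) {
  if (IsGeneric) {
    // '.' and '-' are not valid in a preprocessor identifier, so both are
    // mapped to '_' (gfx10.1-generic -> gfx10_1_generic).
    std::replace(CanonName.begin(), CanonName.end(), '.', '_');
    std::replace(CanonName.begin(), CanonName.end(), '-', '_');
  }
  return "__" + CanonName + "__";
}
```

With this sketch, `macroNameFor("gfx10.1-generic", true)` yields `__gfx10_1_generic__`, while a non-generic name such as `gfx906` is passed through unchanged apart from the underscores added around it.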
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 === === = = === === ==
 
+Generic processors allow execution of a single code objects on any of the processors that
+it supports. Such code objects may not perform as well as those for the non-generic processors.
+
+Generic processors are only available on code object V6 and above (see :ref:`amdgpu-elf-code-object`).
+
+Generic processor code objects are versioned (see :ref:`amdgpu-elf-header-e_flags-table-v6-onwards`).
+The version number is used by runtimes to determine if a code object can be run on a specific agent.

Pierre-vh wrote:

I rephrased it a bit (e.g. member -> supported processor) but I mostly followed your suggestion

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/5] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
Pierre-vh wrote:

For the MD changes, it's just to describe the version increment, nothing else. I think describing it is important as the V6 diff already updated the amdhsa.version. If amdhsa.version didn't need to change then I need to fix that first, and then we can remove the V6 MD section.

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -1642,80 +1746,118 @@ The AMDGPU backend uses the following ELF header:
      ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
      ============================================ ========== =========================================
 
+  .. table:: AMDGPU ELF Header ``e_flags`` for Code Object V6 and After
+     :name: amdgpu-elf-header-e_flags-table-v6-onwards
+
+     ============================================ ========== =========================================
+     Name                                         Value      Description
+     ============================================ ========== =========================================
+     ``EF_AMDGPU_MACH``                           0x0ff      AMDGPU processor selection
+                                                             mask for
+                                                             ``EF_AMDGPU_MACH_xxx`` values
+                                                             defined in
+                                                             :ref:`amdgpu-ef-amdgpu-mach-table`.
+     ``EF_AMDGPU_FEATURE_XNACK_V4``               0x300      XNACK selection mask for
+                                                             ``EF_AMDGPU_FEATURE_XNACK_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4``   0x000      XNACK unsupported.
+     ``EF_AMDGPU_FEATURE_XNACK_ANY_V4``           0x100      XNACK can have any value.
+     ``EF_AMDGPU_FEATURE_XNACK_OFF_V4``           0x200      XNACK disabled.
+     ``EF_AMDGPU_FEATURE_XNACK_ON_V4``            0x300      XNACK enabled.
+     ``EF_AMDGPU_FEATURE_SRAMECC_V4``             0xc00      SRAMECC selection mask for
+                                                             ``EF_AMDGPU_FEATURE_SRAMECC_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4`` 0x000      SRAMECC unsupported.
+     ``EF_AMDGPU_FEATURE_SRAMECC_ANY_V4``         0x400      SRAMECC can have any value.
+     ``EF_AMDGPU_FEATURE_SRAMECC_OFF_V4``         0x800      SRAMECC disabled,
+     ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
+     ``EF_AMDGPU_GENERIC_VERSION_V``              0x01000000 Value between 1 and 255 for generic code

t-tye wrote:

There needs to be a selection mask like for other fields.

https://github.com/llvm/llvm-project/pull/76955
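To make the "selection mask" idea concrete, here is a small sketch of how a tool might split an `e_flags` word into its fields. The three masks `0x0ff`, `0x300` and `0xc00` come from the table quoted above. The generic-version mask and shift are assumptions made for illustration: they place the 1..255 version in the most significant byte of a 32-bit `e_flags`, and the names `AssumedGenericVersionMask`, `AssumedGenericVersionShift`, `DecodedEFlags` and `decodeEFlags` are hypothetical, not LLVM identifiers.

```cpp
#include <cstdint>

// Selection masks as listed in the quoted table.
constexpr std::uint32_t EF_AMDGPU_MACH = 0x0ff;
constexpr std::uint32_t EF_AMDGPU_FEATURE_XNACK_V4 = 0x300;
constexpr std::uint32_t EF_AMDGPU_FEATURE_SRAMECC_V4 = 0xc00;

// Assumption for this sketch only: the generic code object version occupies
// the most significant byte of a 32-bit e_flags word.
constexpr std::uint32_t AssumedGenericVersionMask = 0xff000000u;
constexpr unsigned AssumedGenericVersionShift = 24;

// Hypothetical container for the decoded fields.
struct DecodedEFlags {
  std::uint32_t Mach;           // EF_AMDGPU_MACH_xxx value
  std::uint32_t Xnack;          // one of the EF_AMDGPU_FEATURE_XNACK_*_V4 values
  std::uint32_t SramEcc;        // one of the EF_AMDGPU_FEATURE_SRAMECC_*_V4 values
  std::uint32_t GenericVersion; // 0 means "not a generic code object"
};

inline DecodedEFlags decodeEFlags(std::uint32_t EFlags) {
  DecodedEFlags D;
  D.Mach = EFlags & EF_AMDGPU_MACH;
  D.Xnack = EFlags & EF_AMDGPU_FEATURE_XNACK_V4;
  D.SramEcc = EFlags & EF_AMDGPU_FEATURE_SRAMECC_V4;
  // A mask plus a shift turns the field into a plain integer.
  D.GenericVersion =
      (EFlags & AssumedGenericVersionMask) >> AssumedGenericVersionShift;
  return D;
}
```

Having a dedicated mask for the version field, as the review comment asks, lets readers of the header extract it the same way as the XNACK and SRAMECC fields instead of comparing against 255 separate constants.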
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -4135,6 +4283,33 @@ Code object V5 metadata is the same as
      == == =
 
+.. _amdgpu-amdhsa-code-object-metadata-v6:
+
+Code Object V6 Metadata
++++++++++++++++++++++++
+
+.. warning::
+  Code object V6 is not the default code object version emitted by this version
+  of LLVM.
+
+
+Code object V6 metadata is the same as
+:ref:`amdgpu-amdhsa-code-object-metadata-v5` with the changes defined in table
+:ref:`amdgpu-amdhsa-code-object-metadata-map-table-v6`.
+
+  .. table:: AMDHSA Code Object V6 Metadata Map Changes
+     :name: amdgpu-amdhsa-code-object-metadata-map-table-v6
+
+     ================= ============== ========= =======================================
+     String Key        Value Type     Required? Description
+     ================= ============== ========= =======================================
+     "amdhsa.version"  sequence of    Required  - The first integer is the major

t-tye wrote:

I am not sure what metadata changes would be needed to support generic code objects. I would not add this section.

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 === === = = === === ==
 
+Generic processors allow execution of a single code objects on any of the processors that

t-tye wrote:

objects -> object

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 === === = = === === ==
 
+Generic processors allow execution of a single code objects on any of the processors that
+it supports. Such code objects may not perform as well as those for the non-generic processors.
+
+Generic processors are only available on code object V6 and above (see :ref:`amdgpu-elf-code-object`).
+
+Generic processor code objects are versioned (see :ref:`amdgpu-elf-header-e_flags-table-v6-onwards`).
+The version number is used by runtimes to determine if a code object can be run on a specific agent.

t-tye wrote:

This does not really explain how version is used. What about something like:

The version of non-generic code objects is always set to 0.

For a generic code object, adding a new generic member may require the code generated for the generic target to be changed so it can continue to execute on the previous members as well as on the new member. When this happens the generic code object version number is incremented.

Each member of the generic target has a version when it was introduced. A generic code object can execute on a specific member if the version of the code object being loaded is >= the version at which the member was introduced.

https://github.com/llvm/llvm-project/pull/76955
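The loading rule proposed in this comment can be sketched as a single predicate. This is only an illustration of the wording above, not runtime or loader code: `canRunGenericCodeObject` and both parameter names are hypothetical, and treating version 0 as "not a generic code object" follows the first sentence of the suggestion.

```cpp
#include <cstdint>

// Hypothetical predicate illustrating the proposed rule.
//  CodeObjectVersion: generic version stored in the code object being loaded
//                     (0 for non-generic code objects).
//  MemberIntroducedAt: generic version at which the agent's processor became
//                      a member of the generic target.
inline bool canRunGenericCodeObject(std::uint8_t CodeObjectVersion,
                                    std::uint8_t MemberIntroducedAt) {
  // Version 0 marks a non-generic code object, so the generic rule does not
  // apply to it.
  if (CodeObjectVersion == 0)
    return false;
  // Code generated at version N accounts for every member added up to N.
  return CodeObjectVersion >= MemberIntroducedAt;
}
```

Under this reading, a version 1 code object would be rejected on a processor that joined the generic target at version 2, while a version 2 code object would still load on the members that were already present at version 1.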
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -520,6 +520,104 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 === === = = === === ==
 
+Generic processors also exist.

arsenm wrote:

Weird way to start this section

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -156,6 +156,12 @@ void AMDGPUAsmPrinter::emitFunctionBodyStart() {
   const GCNSubtarget &STM = MF->getSubtarget<GCNSubtarget>();
   const Function &F = MF->getFunction();
 
+  // TODO: We're checking this late, would be nice to check it earlier.
+  if (STM.requiresCodeObjectV6() && CodeObjectVersion < AMDGPU::AMDHSA_COV6)

arsenm wrote:

Braces

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -139,10 +139,10 @@ bool AMDGPURemoveIncompatibleFunctions::checkFunction(Function &F) {
   const GCNSubtarget *ST =
       static_cast<const GCNSubtarget *>(TM->getSubtargetImpl(F));
 
-  // Check the GPU isn't generic. Generic is used for testing only
-  // and we don't want this pass to interfere with it.
+  // Check the GPU isn't generic or generic-hsa. Generic is used for testing
+  // only and we don't want this pass to interfere with it.
   StringRef GPUName = ST->getCPU();
-  if (GPUName.empty() || GPUName.contains("generic"))
+  if (GPUName.empty() || GPUName.starts_with("generic"))

arsenm wrote:

we should probably start refusing to codegen "generic" or whatever none ends up getting called

https://github.com/llvm/llvm-project/pull/76955
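The switch from `contains` to `starts_with` in this hunk matters because the new processor names, such as `gfx9-generic`, also contain the substring "generic". The sketch below reproduces both conditions with `std::string` instead of `StringRef` so it compiles without LLVM headers; `skipsWithContains` and `skipsWithStartsWith` are hypothetical names used only for the comparison.

```cpp
#include <string>

// Old condition: skip the pass when the name is empty or mentions "generic"
// anywhere. This would also skip the new gfx*-generic targets.
inline bool skipsWithContains(const std::string &GPUName) {
  return GPUName.empty() || GPUName.find("generic") != std::string::npos;
}

// New condition: skip only when the name is empty or begins with "generic"
// (i.e. the testing-only "generic" and "generic-hsa" processors).
inline bool skipsWithStartsWith(const std::string &GPUName) {
  // rfind with position 0 can only match at the very start of the string.
  return GPUName.empty() || GPUName.rfind("generic", 0) == 0;
}
```

For `gfx9-generic` the first predicate returns true and the second returns false, which is the behavior change the patch is after: the pass keeps running for the new generic targets but still leaves the plain `generic` processors alone.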
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -279,13 +279,25 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (GPUKind == llvm::AMDGPU::GK_NONE && !IsHIPHost)
     return;
 
-  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
-                                              : getArchNameR600(GPUKind);
+  std::string CanonName = (isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)

arsenm wrote:

SmallString?

https://github.com/llvm/llvm-project/pull/76955
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955

>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/2] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
Pierre-vh wrote: @arsenm do you have any concerns with this change? @t-tye is the documentation good? https://github.com/llvm/llvm-project/pull/76955
[lld] [flang] [clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
https://github.com/kzhuravl approved this pull request. LGTM! Thanks! https://github.com/llvm/llvm-project/pull/76955
[lld] [flang] [clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)
@@ -1633,80 +1741,120 @@ The AMDGPU backend uses the following ELF header:
      ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
      ============================================ ========== =========================================
 
+  .. table:: AMDGPU ELF Header ``e_flags`` for Code Object V6 and After
+     :name: amdgpu-elf-header-e_flags-table-v6-onwards
+
+     ============================================ ========== =========================================
+     Name                                         Value      Description
+     ============================================ ========== =========================================
+     ``EF_AMDGPU_MACH``                           0x0ff      AMDGPU processor selection
+                                                             mask for
+                                                             ``EF_AMDGPU_MACH_xxx`` values
+                                                             defined in
+                                                             :ref:`amdgpu-ef-amdgpu-mach-table`.
+     ``EF_AMDGPU_FEATURE_XNACK_V4``               0x300      XNACK selection mask for
+                                                             ``EF_AMDGPU_FEATURE_XNACK_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4``   0x000      XNACK unsupported.
+     ``EF_AMDGPU_FEATURE_XNACK_ANY_V4``           0x100      XNACK can have any value.
+     ``EF_AMDGPU_FEATURE_XNACK_OFF_V4``           0x200      XNACK disabled.
+     ``EF_AMDGPU_FEATURE_XNACK_ON_V4``            0x300      XNACK enabled.
+     ``EF_AMDGPU_FEATURE_SRAMECC_V4``             0xc00      SRAMECC selection mask for
+                                                             ``EF_AMDGPU_FEATURE_SRAMECC_*_V4``
+                                                             values.
+     ``EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4`` 0x000      SRAMECC unsupported.
+     ``EF_AMDGPU_FEATURE_SRAMECC_ANY_V4``         0x400      SRAMECC can have any value.
+     ``EF_AMDGPU_FEATURE_SRAMECC_OFF_V4``         0x800      SRAMECC disabled,
+     ``EF_AMDGPU_FEATURE_SRAMECC_ON_V4``          0xc00      SRAMECC enabled.
+     ``EF_AMDGPU_GENERIC_VERSION_V``              0x01000000 The most significant byte of EFLAGS
+                                                  to         contains a "generic code object
+                                                  0xff000000 version". This is used by runtimes
+                                                             to determine if a generic code
+                                                             object can be run on a
+                                                             machine.
+                                                             NOTE: This is only set for generic
+                                                             targets. (e.g., ``gfx9-generic``).
+                                                             See :ref:`amdgpu-generic-processor-table`
+     ============================================ ========== =========================================
+
   .. table:: AMDGPU ``EF_AMDGPU_MACH`` Values
      :name: amdgpu-ef-amdgpu-mach-table
 
-     ==================================== ========= ==============================
-     Name                                 Value     Description (see
-                                                    :ref:`amdgpu-processor-table`)
-     ==================================== ========= ==============================
-     ``EF_AMDGPU_MACH_NONE``              0x000     *not specified*
-     ``EF_AMDGPU_MACH_R600_R600``         0x001     ``r600``
-     ``EF_AMDGPU_MACH_R600_R630``         0x002     ``r630``
-     ``EF_AMDGPU_MACH_R600_RS880``        0x003     ``rs880``
-     ``EF_AMDGPU_MACH_R600_RV670``        0x004     ``rv670``
-     ``EF_AMDGPU_MACH_R600_RV710``        0x005     ``rv710``
-     ``EF_AMDGPU_MACH_R600_RV730``        0x006     ``rv730``
-     ``EF_AMDGPU_MACH_R600_RV770``        0x007     ``rv770``
-     ``EF_AMDGPU_MACH_R600_CEDAR``        0x008     ``cedar``
-     ``EF_AMDGPU_MACH_R600_CYPRESS``      0x009     ``cypress``
-     ``EF_AMDGPU_MACH_R600_JUNIPER``      0x00a     ``juniper``
-     ``EF_AMDGPU_MACH_R600_REDWOOD``      0x00b     ``redwood``
-     ``EF_AMDGPU_MACH_R600_SUMO``         0x00c     ``sumo``
-     ``EF_AMDGPU_MACH_R600_BARTS``        0x00d     ``barts``
-     ``EF_AMDGPU_MACH_R600_CAICOS``       0x00e     ``caicos``
-     ``EF_AMDGPU_MACH_R600_CAYMAN``       0x00f     ``cayman``
-     ``EF_AMDGPU_MACH_R600_TURKS``        0x010     ``turks``
-     *reserved*                           0x011 -   Reserved for ``r600``
-                                          0x01f     architecture processors.
-     ``EF_AMDGPU_MACH_AMDGCN_GFX600``     0x020     ``gfx600``
-     ``EF_AMDGPU_MACH_AMDGCN_GFX601``