llvmbot wrote:

@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)

<details>
<summary>Changes</summary>

gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR removes all documentation occurrences of gfx940/gfx941 except for the gfx940 ISA description, which will be the subject of a separate PR.

For SWDEV-512631

---

Patch is 24.54 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126887.diff

2 Files Affected:

- (modified) llvm/docs/AMDGPUOperandSyntax.rst (+2-2)
- (modified) llvm/docs/AMDGPUUsage.rst (+32-65)

``````````diff
diff --git a/llvm/docs/AMDGPUOperandSyntax.rst b/llvm/docs/AMDGPUOperandSyntax.rst
index ff6ec6cf71ff2..e8a76322fe76a 100644
--- a/llvm/docs/AMDGPUOperandSyntax.rst
+++ b/llvm/docs/AMDGPUOperandSyntax.rst
@@ -63,7 +63,7 @@ Note: *N* and *K* must satisfy the following conditions:
 * 0 <= *K* <= 255.
 * *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
 pairs of *vector* registers must be even-aligned
 (first register must be even).
 
@@ -183,7 +183,7 @@ Note: *N* and *K* must satisfy the following conditions:
 * 0 <= *K* <= 255.
 * *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
 pairs of *accumulator* registers must be even-aligned
 (first register must be even).
 
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 83ec1eecb6e5e..14b3b6fce9e70 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -323,7 +323,7 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 Add product names.
- **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX940-GFX942-CDNA3]_ + **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX942-CDNA3]_ ----------------------------------------------------------------------------------------------------------------------- ``gfx900`` ``amdgcn`` dGPU - xnack - Absolute - *rocm-amdhsa* - Radeon Vega flat - *pal-amdhsa* Frontier Edition @@ -378,20 +378,6 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following - Ryzen 3 Pro 4350G - Ryzen 3 Pro 4350GE - ``gfx940`` ``amdgcn`` dGPU - sramecc - Architected *TBA* - - tgsplit flat - - xnack scratch .. TODO:: - - kernarg preload - Packed - work-item Add product - IDs names. - - ``gfx941`` ``amdgcn`` dGPU - sramecc - Architected *TBA* - - tgsplit flat - - xnack scratch .. TODO:: - - kernarg preload - Packed - work-item Add product - IDs names. - ``gfx942`` ``amdgcn`` dGPU - sramecc - Architected - AMD Instinct MI300X - tgsplit flat - AMD Instinct MI300A - xnack scratch @@ -583,10 +569,10 @@ Generic processor code objects are versioned. See :ref:`amdgpu-generic-processor - ``v_dot2_f32_f16`` - ``gfx9-4-generic`` ``amdgcn`` - ``gfx940`` - sramecc - Architected FP8 and BF8 instructions, - - ``gfx941`` - tgsplit flat scratch FP8 and BF8 conversion - - ``gfx942`` - xnack - Packed instructions, as well as - - ``gfx950`` - kernarg preload work-item instructions with XF32 format + ``gfx9-4-generic`` ``amdgcn`` - ``gfx942`` - sramecc - Architected FP8 and BF8 instructions, + - ``gfx950`` - tgsplit flat scratch FP8 and BF8 conversion + - xnack - Packed instructions, as well as + - kernarg preload work-item instructions with XF32 format IDs support are not available. 
``gfx10-1-generic`` ``amdgcn`` - ``gfx1010`` - xnack - Absolute flat - The following instructions are @@ -4974,7 +4960,7 @@ The fields used by CP for code objects before V3 also match those specified in bytes 383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9 Reserved, must be 0. - GFX90A, GFX940 + GFX90A, GFX942 Compute Shader (CS) program settings used by CP to set up @@ -5059,7 +5045,7 @@ The fields used by CP for code objects before V3 also match those specified in 463:460 4 bits Reserved, must be 0. 470:464 7 bits KERNARG_PRELOAD_SPEC_LENGTH GFX6-GFX9 - Reserved, must be 0. - GFX90A, GFX940 + GFX90A, GFX942 - The number of dwords from the kernarg segment to preload into User SGPRs before kernel @@ -5067,7 +5053,7 @@ The fields used by CP for code objects before V3 also match those specified in :ref:`amdgpu-amdhsa-kernarg-preload`). 479:471 9 bits KERNARG_PRELOAD_SPEC_OFFSET GFX6-GFX9 - Reserved, must be 0. - GFX90A, GFX940 + GFX90A, GFX942 - An offset in dwords into the kernarg segment to begin preloading data into User @@ -5093,7 +5079,7 @@ The fields used by CP for code objects before V3 also match those specified in GFX6-GFX9 - vgprs_used 0..256 - max(0, ceil(vgprs_used / 4) - 1) - GFX90A, GFX940 + GFX90A, GFX942 - vgprs_used 0..512 - vgprs_used = align(arch_vgprs, 4) + acc_vgprs @@ -5559,7 +5545,7 @@ The fields used by CP for code objects before V3 also match those specified in .. - .. table:: compute_pgm_rsrc3 for GFX90A, GFX940 + .. table:: compute_pgm_rsrc3 for GFX90A, GFX942 :name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table ======= ======= =============================== =========================================================================== @@ -9970,15 +9956,15 @@ only accessed by a single thread, and is always write-before-read, there is never a need to invalidate these entries from the L1 cache. Hence all cache invalidates are done as ``*_vol`` to only invalidate the volatile cache lines. 
-The code sequences used to implement the memory model for GFX940, GFX941, GFX942 -are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table`. +The code sequences used to implement the memory model for GFX942 are defined in +table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx942-table`. - .. table:: AMDHSA Memory Model Code Sequences GFX940, GFX941, GFX942 - :name: amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table + .. table:: AMDHSA Memory Model Code Sequences GFX942 + :name: amdgpu-amdhsa-memory-model-code-sequences-gfx942-table ============ ============ ============== ========== ================================ LLVM Instr LLVM Memory LLVM Memory AMDGPU AMDGPU Machine Code - Ordering Sync Scope Address GFX940, GFX941, GFX942 + Ordering Sync Scope Address GFX942 Space ============ ============ ============== ========== ================================ **Non-Atomic** @@ -10013,18 +9999,12 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9 load *none* *none* - local 1. ds_load store *none* *none* - global - !volatile & !nontemporal - generic - - private 1. GFX940, GFX941 + - private 1. GFX942 - constant buffer/global/flat_store - sc0=1 sc1=1 - GFX942 - buffer/global/flat_store - !volatile & nontemporal - 1. GFX940, GFX941 - buffer/global/flat_store - nt=1 sc0=1 sc1=1 - GFX942 + 1. GFX942 buffer/global/flat_store nt=1 @@ -10696,11 +10676,8 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9 **Release Atomic** ------------------------------------------------------------------------------------ - store atomic release - singlethread - global 1. GFX940, GFX941 + store atomic release - singlethread - global 1. 
GFX942 - wavefront - generic buffer/global/flat_store - sc0=1 sc1=1 - GFX942 - buffer/global/flat_store store atomic release - singlethread - local *If TgSplit execution mode, - wavefront local address space cannot @@ -10738,10 +10715,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9 store that is being released. - 2. GFX940, GFX941 - buffer/global/flat_store - sc0=1 sc1=1 - GFX942 + 2. GFX942 buffer/global/flat_store sc0=1 store atomic release - workgroup - local *If TgSplit execution mode, @@ -10802,10 +10776,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9 store that is being released. - 3. GFX940, GFX941 - buffer/global/flat_store - sc0=1 sc1=1 - GFX942 + 3. GFX942 buffer/global/flat_store sc1=1 store atomic release - system - global 1. buffer_wbl2 sc0=1 sc1=1 @@ -17563,11 +17534,7 @@ in this description. CDNA 2 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx90a<AMDGPU/AMDGPUAsmGFX90a>` - CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx940<AMDGPU/AMDGPUAsmGFX940>` - - :doc:`gfx941<AMDGPU/AMDGPUAsmGFX940>` - - :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>` + CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>` RDNA 1 :doc:`GFX10 RDNA1<AMDGPU/AMDGPUAsmGFX10>` :doc:`gfx1010<AMDGPU/AMDGPUAsmGFX10>` @@ -17605,7 +17572,7 @@ combinations of operands, refer to one of instruction set architecture manuals [AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_, [AMD-GCN-GFX900-GFX904-VEGA]_, [AMD-GCN-GFX906-VEGA7NM]_, [AMD-GCN-GFX908-CDNA1]_, [AMD-GCN-GFX90A-CDNA2]_, -[AMD-GCN-GFX940-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_, +[AMD-GCN-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_, [AMD-GCN-GFX11-RDNA3]_ and [AMD-GCN-GFX11-RDNA3.5]_. Operands @@ -18118,7 +18085,7 @@ terminated by an ``.end_amdhsa_kernel`` directive. 
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table` ``.amdhsa_user_sgpr_private_segment_buffer`` 0 GFX6-GFX10 Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in (except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. - GFX940) + GFX942) ``.amdhsa_user_sgpr_dispatch_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_DISPATCH_PTR in :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. ``.amdhsa_user_sgpr_queue_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_QUEUE_PTR in @@ -18129,7 +18096,7 @@ terminated by an ``.end_amdhsa_kernel`` directive. :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. ``.amdhsa_user_sgpr_flat_scratch_init`` 0 GFX6-GFX10 Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in (except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. - GFX940) + GFX942) ``.amdhsa_user_sgpr_private_segment_size`` 0 GFX6-GFX12 Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. ``.amdhsa_wavefront_size32`` Target GFX10-GFX12 Controls ENABLE_WAVEFRONT_SIZE32 in @@ -18140,8 +18107,8 @@ terminated by an ``.end_amdhsa_kernel`` directive. :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`. ``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0 GFX6-GFX10 Controls ENABLE_PRIVATE_SEGMENT in (except :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`. - GFX940) - ``.amdhsa_enable_private_segment`` 0 GFX940, Controls ENABLE_PRIVATE_SEGMENT in + GFX942) + ``.amdhsa_enable_private_segment`` 0 GFX942, Controls ENABLE_PRIVATE_SEGMENT in GFX11-GFX12 :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`. ``.amdhsa_system_sgpr_workgroup_id_x`` 1 GFX6-GFX12 Controls ENABLE_SGPR_WORKGROUP_ID_X in :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`. @@ -18162,14 +18129,14 @@ terminated by an ``.end_amdhsa_kernel`` directive. Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`. ``.amdhsa_accum_offset`` Required GFX90A, Offset of a first AccVGPR in the unified register file. 
- GFX940 Used to calculate ACCUM_OFFSET in
+ GFX942 Used to calculate ACCUM_OFFSET in
 :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`. ``.amdhsa_reserve_vcc`` 1 GFX6-GFX12 Whether the kernel may use the special VCC SGPR. Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`. ``.amdhsa_reserve_flat_scratch`` 1 GFX7-GFX10 Whether the kernel may use flat instructions to access
... [truncated]

``````````

</details>

https://github.com/llvm/llvm-project/pull/126887

_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits