[llvm-branch-commits] [llvm] [AMDGPU][docs] Replace gfx940 and gfx941 with gfx942 in llvm/docs (PR #126887)

Fabian Ritter via llvm-branch-commits Wed, 12 Feb 2025 02:51:46 -0800

https://github.com/ritter-x2a created https://github.com/llvm/llvm-project/pull/126887


gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all documentation occurrences of gfx940/gfx941 except
for the gfx940 ISA description, which will be the subject of a separate
PR.

For SWDEV-512631
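
A rough way to check the goal stated above is to scan the documentation text for leftover mentions. The sketch below is illustrative only and not part of this PR; the helper name `find_stale_targets` and the regular expression are assumptions. It would also flag names that are kept on purpose, such as the `AMDGPUAsmGFX940` ISA description that this PR leaves for a separate change.

```python
import re

# Hypothetical helper, not part of the PR: matches gfx940/gfx941 in any case.
STALE = re.compile(r"gfx94[01]", re.IGNORECASE)


def find_stale_targets(text):
    """Return (line_number, line) pairs that still mention gfx940 or gfx941."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE.search(line)
    ]


if __name__ == "__main__":
    sample = "GFX90A and GFX940 have an alignment requirement\nGFX942 only"
    for number, line in find_stale_targets(sample):
        print(f"{number}: {line}")
```

Reading each `.rst` file under `llvm/docs` and passing its contents to the helper would list the lines that still need attention.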

From 12460f083060a7e330975a7354bdc019f670f386 Mon Sep 17 00:00:00 2001
From: Fabian Ritter <fabian.rit...@amd.com>
Date: Wed, 12 Feb 2025 05:45:01 -0500
Subject: [PATCH] [AMDGPU][docs] Replace gfx940 and gfx941 with gfx942 in
 llvm/docs

gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all documentation occurrences of gfx940/gfx941 except
for the gfx940 ISA description, which will be the subject of a separate
PR.

For SWDEV-512631
---
 llvm/docs/AMDGPUOperandSyntax.rst |  4 +-
 llvm/docs/AMDGPUUsage.rst         | 97 ++++++++++---------------------
 2 files changed, 34 insertions(+), 67 deletions(-)

diff --git a/llvm/docs/AMDGPUOperandSyntax.rst b/llvm/docs/AMDGPUOperandSyntax.rst
index ff6ec6cf71ff2..e8a76322fe76a 100644
--- a/llvm/docs/AMDGPUOperandSyntax.rst
+++ b/llvm/docs/AMDGPUOperandSyntax.rst
@@ -63,7 +63,7 @@ Note: *N* and *K* must satisfy the following conditions:
 * 0 <= *K* <= 255.
 * *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
 pairs of *vector* registers must be even-aligned
 (first register must be even).
 
@@ -183,7 +183,7 @@ Note: *N* and *K* must satisfy the following conditions:
 * 0 <= *K* <= 255.
 * *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
 
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
 pairs of *accumulator* registers must be even-aligned
 (first register must be even).
 
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 83ec1eecb6e5e..14b3b6fce9e70 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -323,7 +323,7 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
                                                                                                         Add product
                                                                                                         names.
 
-     **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX940-GFX942-CDNA3]_
+     **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX942-CDNA3]_
      -----------------------------------------------------------------------------------------------------------------------
      ``gfx900``                  ``amdgcn``   dGPU  - xnack           - Absolute      - *rocm-amdhsa* - Radeon Vega
                                                                         flat          - *pal-amdhsa*    Frontier Edition
@@ -378,20 +378,6 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
                                                                                                       - Ryzen 3 Pro 4350G
                                                                                                       - Ryzen 3 Pro 4350GE
 
-     ``gfx940``                  ``amdgcn``   dGPU  - sramecc         - Architected                   *TBA*
-                                                    - tgsplit           flat
-                                                    - xnack             scratch                       .. TODO::
-                                                    - kernarg preload - Packed
-                                                                        work-item                       Add product
-                                                                        IDs                             names.
-
-     ``gfx941``                  ``amdgcn``   dGPU  - sramecc         - Architected                   *TBA*
-                                                    - tgsplit           flat
-                                                    - xnack             scratch                       .. TODO::
-                                                    - kernarg preload - Packed
-                                                                        work-item                       Add product
-                                                                        IDs                             names.
-
      ``gfx942``                  ``amdgcn``   dGPU  - sramecc         - Architected                   - AMD Instinct MI300X
                                                      - tgsplit           flat                          - AMD Instinct MI300A
                                                     - xnack             scratch
@@ -583,10 +569,10 @@ Generic processor code objects are versioned. See :ref:`amdgpu-generic-processor
                                                                                                   - ``v_dot2_f32_f16``
 
 
-     ``gfx9-4-generic``   ``amdgcn``     - ``gfx940``      - sramecc          - Architected     FP8 and BF8 instructions,
-                                         - ``gfx941``      - tgsplit            flat scratch    FP8 and BF8 conversion
-                                         - ``gfx942``      - xnack            - Packed          instructions, as well as
-                                         - ``gfx950``      - kernarg preload    work-item       instructions with XF32 format
+     ``gfx9-4-generic``   ``amdgcn``     - ``gfx942``      - sramecc          - Architected     FP8 and BF8 instructions,
+                                         - ``gfx950``      - tgsplit            flat scratch    FP8 and BF8 conversion
+                                                           - xnack            - Packed          instructions, as well as
+                                                           - kernarg preload    work-item       instructions with XF32 format
                                                                                 IDs             support are not available.
 
      ``gfx10-1-generic``  ``amdgcn``     - ``gfx1010``     - xnack            - Absolute flat   - The following instructions are
@@ -4974,7 +4960,7 @@ The fields used by CP for code objects before V3 also match those specified in
              bytes
      383:352 4 bytes COMPUTE_PGM_RSRC3               GFX6-GFX9
                                                        Reserved, must be 0.
-                                                     GFX90A, GFX940
+                                                     GFX90A, GFX942
                                                        Compute Shader (CS)
                                                        program settings used by
                                                        CP to set up
@@ -5059,7 +5045,7 @@ The fields used by CP for code objects before V3 also match those specified in
      463:460 4 bits                                  Reserved, must be 0.
      470:464 7 bits  KERNARG_PRELOAD_SPEC_LENGTH     GFX6-GFX9
                                                        - Reserved, must be 0.
-                                                     GFX90A, GFX940
+                                                     GFX90A, GFX942
                                                        - The number of dwords from
                                                          the kernarg segment to preload
                                                          into User SGPRs before kernel
@@ -5067,7 +5053,7 @@ The fields used by CP for code objects before V3 also match those specified in
                                                          :ref:`amdgpu-amdhsa-kernarg-preload`).
      479:471 9 bits  KERNARG_PRELOAD_SPEC_OFFSET     GFX6-GFX9
                                                        - Reserved, must be 0.
-                                                     GFX90A, GFX940
+                                                     GFX90A, GFX942
                                                        - An offset in dwords into the
                                                          kernarg segment to begin
                                                          preloading data into User
@@ -5093,7 +5079,7 @@ The fields used by CP for code objects before V3 also match those specified in
                                                      GFX6-GFX9
                                                        - vgprs_used 0..256
                                                        - max(0, ceil(vgprs_used / 4) - 1)
-                                                     GFX90A, GFX940
+                                                     GFX90A, GFX942
                                                        - vgprs_used 0..512
                                                        - vgprs_used = align(arch_vgprs, 4)
                                                                       + acc_vgprs
@@ -5559,7 +5545,7 @@ The fields used by CP for code objects before V3 also match those specified in
 
 ..
 
-  .. table:: compute_pgm_rsrc3 for GFX90A, GFX940
+  .. table:: compute_pgm_rsrc3 for GFX90A, GFX942
      :name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table
 
      ======= ======= =============================== ===========================================================================
@@ -9970,15 +9956,15 @@ only accessed by a single thread, and is always write-before-read, there is
 never a need to invalidate these entries from the L1 cache. Hence all cache
 invalidates are done as ``*_vol`` to only invalidate the volatile cache lines.
 
-The code sequences used to implement the memory model for GFX940, GFX941, GFX942
-are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table`.
+The code sequences used to implement the memory model for GFX942 are defined in
+table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx942-table`.
 
-  .. table:: AMDHSA Memory Model Code Sequences GFX940, GFX941, GFX942
-     :name: amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table
+  .. table:: AMDHSA Memory Model Code Sequences GFX942
+     :name: amdgpu-amdhsa-memory-model-code-sequences-gfx942-table
 
      ============ ============ ============== ========== ================================
      LLVM Instr   LLVM Memory  LLVM Memory    AMDGPU     AMDGPU Machine Code
-                  Ordering     Sync Scope     Address    GFX940, GFX941, GFX942
+                  Ordering     Sync Scope     Address    GFX942
                                               Space
      ============ ============ ============== ========== ================================
      **Non-Atomic**
@@ -10013,18 +9999,12 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
      load         *none*       *none*         - local    1. ds_load
      store        *none*       *none*         - global   - !volatile & !nontemporal
                                               - generic
-                                              - private    1. GFX940, GFX941
+                                              - private    1. GFX942
                                               - constant        buffer/global/flat_store
-                                                                sc0=1 sc1=1
-                                                              GFX942
-                                                                buffer/global/flat_store
 
                                                          - !volatile & nontemporal
 
-                                                           1. GFX940, GFX941
-                                                                buffer/global/flat_store
-                                                                nt=1 sc0=1 sc1=1
-                                                              GFX942
+                                                           1. GFX942
                                                                 buffer/global/flat_store
                                                                 nt=1
 
@@ -10696,11 +10676,8 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
 
      **Release Atomic**
      ------------------------------------------------------------------------------------
-     store atomic release      - singlethread - global   1. GFX940, GFX941
+     store atomic release      - singlethread - global   1. GFX942
                                - wavefront    - generic       buffer/global/flat_store
-                                                              sc0=1 sc1=1
-                                                            GFX942
-                                                              buffer/global/flat_store
 
      store atomic release      - singlethread - local    *If TgSplit execution mode,
                                - wavefront               local address space cannot
@@ -10738,10 +10715,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
                                                              store that is being
                                                              released.
 
-                                                         2. GFX940, GFX941
-                                                              buffer/global/flat_store
-                                                              sc0=1 sc1=1
-                                                            GFX942
+                                                         2. GFX942
                                                               buffer/global/flat_store
                                                               sc0=1
      store atomic release      - workgroup    - local    *If TgSplit execution mode,
@@ -10802,10 +10776,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
                                                              store that is being
                                                              released.
 
-                                                         3. GFX940, GFX941
-                                                              buffer/global/flat_store
-                                                              sc0=1 sc1=1
-                                                            GFX942
+                                                         3. GFX942
                                                               buffer/global/flat_store
                                                               sc1=1
      store atomic release      - system       - global   1. buffer_wbl2 sc0=1 sc1=1
@@ -17563,11 +17534,7 @@ in this description.
 
     CDNA 2        :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`             :doc:`gfx90a<AMDGPU/AMDGPUAsmGFX90a>`
 
-    CDNA 3        :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`             :doc:`gfx940<AMDGPU/AMDGPUAsmGFX940>`
-
-                                                                :doc:`gfx941<AMDGPU/AMDGPUAsmGFX940>`
-
-                                                                :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
+    CDNA 3        :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>`             :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
 
     RDNA 1        :doc:`GFX10 RDNA1<AMDGPU/AMDGPUAsmGFX10>`     :doc:`gfx1010<AMDGPU/AMDGPUAsmGFX10>`
 
@@ -17605,7 +17572,7 @@ combinations of operands, refer to one of instruction set architecture manuals
 [AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_,
 [AMD-GCN-GFX900-GFX904-VEGA]_, [AMD-GCN-GFX906-VEGA7NM]_,
 [AMD-GCN-GFX908-CDNA1]_, [AMD-GCN-GFX90A-CDNA2]_,
-[AMD-GCN-GFX940-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
+[AMD-GCN-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
 [AMD-GCN-GFX11-RDNA3]_ and [AMD-GCN-GFX11-RDNA3.5]_.
 
 Operands
@@ -18118,7 +18085,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`
      ``.amdhsa_user_sgpr_private_segment_buffer``             0                   GFX6-GFX10   Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in
                                                                                   (except      :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
-                                                                                  GFX940)
+                                                                                  GFX942)
      ``.amdhsa_user_sgpr_dispatch_ptr``                       0                   GFX6-GFX12   Controls ENABLE_SGPR_DISPATCH_PTR in
                                                                                                :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ``.amdhsa_user_sgpr_queue_ptr``                          0                   GFX6-GFX12   Controls ENABLE_SGPR_QUEUE_PTR in
@@ -18129,7 +18096,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
                                                                                                :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ``.amdhsa_user_sgpr_flat_scratch_init``                  0                   GFX6-GFX10   Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in
                                                                                   (except      :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
-                                                                                  GFX940)
+                                                                                  GFX942)
      ``.amdhsa_user_sgpr_private_segment_size``               0                   GFX6-GFX12   Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in
                                                                                                :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ``.amdhsa_wavefront_size32``                             Target              GFX10-GFX12  Controls ENABLE_WAVEFRONT_SIZE32 in
@@ -18140,8 +18107,8 @@ terminated by an ``.end_amdhsa_kernel`` directive.
                                                                                                :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0                   GFX6-GFX10   Controls ENABLE_PRIVATE_SEGMENT in
                                                                                   (except      :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
-                                                                                  GFX940)
-     ``.amdhsa_enable_private_segment``                       0                   GFX940,      Controls ENABLE_PRIVATE_SEGMENT in
+                                                                                  GFX942)
+     ``.amdhsa_enable_private_segment``                       0                   GFX942,      Controls ENABLE_PRIVATE_SEGMENT in
                                                                                   GFX11-GFX12  :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
      ``.amdhsa_system_sgpr_workgroup_id_x``                   1                   GFX6-GFX12   Controls ENABLE_SGPR_WORKGROUP_ID_X in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
@@ -18162,14 +18129,14 @@ terminated by an ``.end_amdhsa_kernel`` directive.
                                                                                                Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
      ``.amdhsa_accum_offset``                                 Required            GFX90A,      Offset of a first AccVGPR in the unified register file.
-                                                                                  GFX940       Used to calculate ACCUM_OFFSET in
+                                                                                  GFX942       Used to calculate ACCUM_OFFSET in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
      ``.amdhsa_reserve_vcc``                                  1                   GFX6-GFX12   Whether the kernel may use the special VCC SGPR.
                                                                                                Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
      ``.amdhsa_reserve_flat_scratch``                         1                   GFX7-GFX10   Whether the kernel may use flat instructions to access
                                                                                   (except      scratch memory. Used to calculate
-                                                                                  GFX940)      GRANULATED_WAVEFRONT_SGPR_COUNT in
+                                                                                  GFX942)      GRANULATED_WAVEFRONT_SGPR_COUNT in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
      ``.amdhsa_reserve_xnack_mask``                           Target              GFX8-GFX10   Whether the kernel may trigger XNACK replay.
                                                               Feature                          Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
@@ -18200,7 +18167,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
      ``.amdhsa_fp16_overflow``                                0                   GFX9-GFX12   Controls FP16_OVFL in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
      ``.amdhsa_tg_split``                                     Target              GFX90A,      Controls TG_SPLIT in
-                                                              Feature             GFX940,      :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
+                                                              Feature             GFX942,      :ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
                                                               Specific            GFX11-GFX12
                                                               (tgsplit)
      ``.amdhsa_workgroup_processor_mode``                     Target              GFX10-GFX12  Controls ENABLE_WGP_MODE in
@@ -18228,9 +18195,9 @@ terminated by an ``.end_amdhsa_kernel`` directive.
      ``.amdhsa_exception_int_div_zero``                       0                   GFX6-GFX12   Controls ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO in
                                                                                                :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
      ``.amdhsa_user_sgpr_kernarg_preload_length``             0                   GFX90A,      Controls KERNARG_PRELOAD_SPEC_LENGTH in
-                                                                                  GFX940       :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
+                                                                                  GFX942       :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ``.amdhsa_user_sgpr_kernarg_preload_offset``             0                   GFX90A,      Controls KERNARG_PRELOAD_SPEC_OFFSET in
-                                                                                  GFX940       :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
+                                                                                  GFX942       :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
      ======================================================== =================== ============ ===================
 
 .amdgpu_metadata
@@ -18400,7 +18367,7 @@ Additional Documentation
 .. [AMD-GCN-GFX906-VEGA7NM] `AMD Vega 7nm Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/11/Vega_7nm_Shader_ISA_26November2019.pdf>`__
 .. [AMD-GCN-GFX908-CDNA1] `AMD Instinct MI100 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf>`__
 .. [AMD-GCN-GFX90A-CDNA2] `AMD Instinct MI200 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf>`__
-.. [AMD-GCN-GFX940-GFX942-CDNA3] `AMD Instinct MI300 Instruction Set Architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf>`__
+.. [AMD-GCN-GFX942-CDNA3] `AMD Instinct MI300 Instruction Set Architecture <https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf>`__
 .. [AMD-GCN-GFX10-RDNA1] `AMD RDNA 1.0 Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf>`__
 .. [AMD-GCN-GFX10-RDNA2] `AMD RDNA 2 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf>`__
 .. [AMD-GCN-GFX11-RDNA3] `AMD RDNA 3 Instruction Set Architecture <https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf>`__
