Hi Andrew!
On 2025-09-09T16:52:57+0000, Andrew Stubbs <[email protected]> wrote:
> The previous definition had all the GFX11 register counts doubled to fix a bug
> that was encountered in early testing. This seems to have been a
> misunderstanding of the problem (which is no longer reproducible).
I can't comment on the historical aspects, but I can tell you that since
commit r16-3726-g7bc2e311688ac279f1abc2a47944e5b763f7ec89
"amdgcn: fix GFX10/GFX11 VGPR counts", '-march=gfx1100' testing is
completely broken; I get nothing but:
Memory access fault by GPU node-2 (Agent handle: [...]) on address (nil).
Reason: Page not present or supervisor privilege.
May I 'git push' my 'git revert', or should I keep that local, awaiting
your investigation?
Shouldn't the commit you pushed also have removed the following from
'gcc/config/gcn/gcn.cc:gcn_hsa_declare_function_name':
    fprintf (file,
             ".kd\n"
             [...]
             " .vgpr_count: %i%s\n"
             [...], next_free_vgpr,
             (TARGET_WAVE64_COMPAT
              ? " ; wavefrontsize64 counts double on SIMD32"
              : ""));
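
To make the arithmetic concrete, here is a minimal standalone sketch of
what the removed fudge does to a register count before rounding.  The
helper name 'rounded_vgpr_count', its parameters, and the example values
are my assumptions for illustration only; they are not taken from
'gcn.cc', and the rounding step is merely modelled on the shape of the
code visible in the quoted hunk:

```c
/* Hypothetical helper, not part of gcn.cc.  Optionally doubles the
   VGPR count (the removed "wave64 fudge"), then rounds the result up
   to a multiple of BLOCK_SIZE (standing in for the allocation
   granularity).  */
static int
rounded_vgpr_count (int vgpr, int block_size, int double_for_wave64)
{
  /* The step the quoted patch removes.  */
  if (double_for_wave64)
    vgpr *= 2;

  /* Round up to the allocation block size.  */
  if (vgpr % block_size)
    vgpr += block_size - (vgpr % block_size);

  return vgpr;
}
```

With an assumed count of 40 and an assumed block size of 16, this
returns 80 with the doubling and 48 without it, so the two variants
report quite different counts for the same function.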
Regards
Thomas
> gcc/ChangeLog:
>
> * config/gcn/gcn-devices.def: Correct the Max ISA VGPRs counts for
> GFX10 and GFX11 devices.
> * config/gcn/gcn.cc (gcn_hsa_declare_function_name): Remove the wave64
> VGPR count fudge.
> ---
> gcc/config/gcn/gcn-devices.def | 34 +++++++++++++++++-----------------
> gcc/config/gcn/gcn.cc | 4 ----
> 2 files changed, 17 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/config/gcn/gcn-devices.def b/gcc/config/gcn/gcn-devices.def
> index 426acf0cb7a..b27385bc565 100644
> --- a/gcc/config/gcn/gcn-devices.def
> +++ b/gcc/config/gcn/gcn-devices.def
> @@ -222,7 +222,7 @@ GCN_DEVICE(gfx1030, GFX1030, 0x36, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -233,7 +233,7 @@ GCN_DEVICE(gfx1031, GFX1031, 0x37, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -244,7 +244,7 @@ GCN_DEVICE(gfx1032, GFX1032, 0x38, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -255,7 +255,7 @@ GCN_DEVICE(gfx1033, GFX1033, 0x39, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -266,7 +266,7 @@ GCN_DEVICE(gfx1034, GFX1034, 0x3e, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -277,7 +277,7 @@ GCN_DEVICE(gfx1035, GFX1035, 0x3d, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -288,7 +288,7 @@ GCN_DEVICE(gfx1036, GFX1036, 0x45, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX10,
> /* Generic Name */ GFX10_3_GENERIC
> @@ -299,7 +299,7 @@ GCN_DEVICE(gfx10-3-generic, GFX10_3_GENERIC, 0x053, ISA_RDNA2,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 512, /* 512 SIMD32 = 256 wavefrontsize64. */
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 1,
> /* Architecture Family */ GFX10,
> /* Generic Name */ NONE
> @@ -312,7 +312,7 @@ GCN_DEVICE(gfx1100, GFX1100, 0x41, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536, /* 1536 SIMD32 = 768 wavefrontsize64. */
> + /* Max ISA VGPRs */ 768,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -323,7 +323,7 @@ GCN_DEVICE(gfx1101, GFX1101, 0x46, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 768,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -334,7 +334,7 @@ GCN_DEVICE(gfx1102, GFX1102, 0x47, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -345,7 +345,7 @@ GCN_DEVICE(gfx1103, GFX1103, 0x44, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -356,7 +356,7 @@ GCN_DEVICE(gfx1150, GFX1150, 0x43, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -367,7 +367,7 @@ GCN_DEVICE(gfx1151, GFX1151, 0x4a, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 768,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -378,7 +378,7 @@ GCN_DEVICE(gfx1152, GFX1152, 0x55, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -389,7 +389,7 @@ GCN_DEVICE(gfx1153, GFX1153, 0x58, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 0, /* non-generic */
> /* Architecture Family */ GFX11,
> /* Generic Name */ GFX11_GENERIC
> @@ -400,7 +400,7 @@ GCN_DEVICE(gfx11-generic, GFX11_GENERIC, 0x054, ISA_RDNA3,
> /* SRAM_ECC default */ HSACO_ATTR_UNSUPPORTED,
> /* WAVE64 mode */ HSACO_ATTR_ON,
> /* CU mode */ HSACO_ATTR_ON,
> - /* Max ISA VGPRs */ 1536,
> + /* Max ISA VGPRs */ 512,
> /* Generic code obj version */ 1,
> /* Architecture Family */ GFX11,
> /* Generic Name */ NONE
> diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
> index 665b05321e1..df1c1a5b19b 100644
> --- a/gcc/config/gcn/gcn.cc
> +++ b/gcc/config/gcn/gcn.cc
> @@ -6787,10 +6787,6 @@ gcn_hsa_declare_function_name (FILE *file, const char *name,
> avgpr = MAX_NORMAL_AVGPR_COUNT;
> }
>
> - /* SIMD32 devices count double in wavefront64 mode. */
> - if (TARGET_WAVE64_COMPAT)
> - vgpr *= 2;
> -
> /* Round up to the allocation block size. */
> int vgpr_block_size = TARGET_VGPR_GRANULARITY;
> if (vgpr % vgpr_block_size)
> --
> 2.50.0