[clang] [llvm] [mlir] [AMDGPU] Verify AMDGPU required workgroup size matches flat workgroup size (PR #200989)

Steffen Larsen via cfe-commits Wed, 10 Jun 2026 00:58:24 -0700

================
@@ -51,6 +51,9 @@ __attribute__((amdgpu_waves_per_eu(0, 4))) kernel void kernel_waves_per_eu_0_4()
 __attribute__((amdgpu_flat_work_group_size(64, 32))) kernel void kernel_flat_work_group_size_64_32() {} // expected-error {{'amdgpu_flat_work_group_size' attribute argument is invalid: min must not be greater than max}}
 __attribute__((amdgpu_waves_per_eu(4, 2))) kernel void kernel_waves_per_eu_4_2() {} // expected-error {{'amdgpu_waves_per_eu' attribute argument is invalid: min must not be greater than max}}
 
+__attribute__((reqd_work_group_size(32, 2, 1), amdgpu_flat_work_group_size(16, 128))) kernel void kernel_reqd_work_group_size_32_2_1_flat_work_group_size_16_128() {} // expected-error {{'amdgpu_flat_work_group_size' attribute must match 'reqd_work_group_size' product}}
+__attribute__((amdgpu_flat_work_group_size(16, 128), reqd_work_group_size(32, 2, 1))) kernel void kernel_flat_work_group_size_16_128_reqd_work_group_size_32_2_1() {} // expected-error {{'amdgpu_flat_work_group_size' attribute must match 'reqd_work_group_size' product}}
+
----------------
steffenlarsen wrote:


Either solution is fine by me. Arguably @krzysz00's suggestion is the least 
disruptive: we currently let this combination through, so (assuming it produces 
something sensible for the CUDA/HIP path) it avoids breaking existing user code.

As a side note, SYCL 2020 has the attribute as well, though under `sycl::`, 
which implementations may just use as an alias. Clang-based SYCL 
implementations may also prefer that `reqd_work_group_size` is properly 
maintained in the C++ path. Not that it needs to have much of a bearing on 
what we do for CUDA/HIP.
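
For reference, the condition the new tests exercise can be sketched roughly as
below. This is only an illustration, not the code in the PR: the names
`FlatWorkGroupSize`, `reqdProduct` and `flatSizeMatchesReqd` are hypothetical,
and the rule that both the min and the max of `amdgpu_flat_work_group_size`
must equal the `reqd_work_group_size` product is an assumption inferred from
the diagnostic text. The tests above only show that (16, 128) is rejected when
the product is 32 * 2 * 1 = 64.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two arguments of amdgpu_flat_work_group_size.
struct FlatWorkGroupSize {
  std::uint32_t Min;
  std::uint32_t Max;
};

// Product of the three reqd_work_group_size dimensions, widened to 64 bits
// before multiplying so that typical sizes do not wrap around in 32 bits.
constexpr std::uint64_t reqdProduct(const std::array<std::uint32_t, 3> &Dims) {
  return std::uint64_t{Dims[0]} * Dims[1] * Dims[2];
}

// ASSUMPTION: "must match" is read here as Min == Max == product.
constexpr bool flatSizeMatchesReqd(FlatWorkGroupSize Flat,
                                   const std::array<std::uint32_t, 3> &Dims) {
  const std::uint64_t Product = reqdProduct(Dims);
  return Flat.Min == Product && Flat.Max == Product;
}

// The combination from the new tests: product is 64, flat range is [16, 128].
static_assert(!flatSizeMatchesReqd({16, 128}, {32, 2, 1}),
              "mismatching flat range is rejected under this reading");
static_assert(flatSizeMatchesReqd({64, 64}, {32, 2, 1}),
              "flat range pinned to the product is accepted");
```

Under a looser reading (the product merely has to lie within [min, max]) the
first case would be accepted, which is not what the expected-error lines show.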

https://github.com/llvm/llvm-project/pull/200989
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
