================
@@ -49,7 +49,7 @@ void test_s_wait_event_export_ready() {
 }

 // CHECK-LABEL: @test_global_add_f32
-// CHECK: {{.*}}call{{.*}} float @llvm.amdgcn.global.atomic.fadd.f32.p1.f32(ptr addrspace(1) %{{.*}}, float %{{.*}})
+// CHECK: = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") seq_cst, align 4, !amdgpu.no.fine.grained.memory !{{[0-9]+}}, !amdgpu.ignore.denormal.mode !{{[0-9]+$}}
----------------
arsenm wrote:
That's the most conservative option. The current intrinsic handling isn't treated as an atomic at all, and the lowering adds a volatile flag instead. With seq_cst you end up with an additional cache flush compared to the current intrinsic. Release seems to be the strongest ordering that doesn't introduce a new flush afterwards.

Running codegen and checking ISA is generally discouraged in clang tests.

https://github.com/llvm/llvm-project/pull/96872
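[Editor's note: an illustrative IR sketch, not part of the original email, showing the two orderings being compared; %addr and %x are taken from the test's CHECK line above.]

  ; What the patch currently emits: seq_cst, which per the comment above
  ; introduces an additional cache flush relative to the old intrinsic lowering.
  %old.seqcst = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") seq_cst, align 4

  ; The suggested alternative: release still orders prior memory operations
  ; but, per the comment, avoids the extra flush after the RMW.
  %old.release = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") release, align 4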