================
@@ -49,7 +49,7 @@ void test_s_wait_event_export_ready() {
 }

 // CHECK-LABEL: @test_global_add_f32
-// CHECK: {{.*}}call{{.*}} float @llvm.amdgcn.global.atomic.fadd.f32.p1.f32(ptr addrspace(1) %{{.*}}, float %{{.*}})
+// CHECK: = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") seq_cst, align 4, !amdgpu.no.fine.grained.memory !{{[0-9]+}}, !amdgpu.ignore.denormal.mode !{{[0-9]+$}}
----------------
arsenm wrote:
That's the most conservative option. The current intrinsic handling isn't treated as an atomic at all, and the lowering adds a volatile flag instead. With seq_cst you end up with an additional cache flush compared to the current intrinsic. Release seems to be the strongest ordering that doesn't introduce a new flush afterwards.

Running codegen and checking ISA is generally discouraged in clang tests.

https://github.com/llvm/llvm-project/pull/96872
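[Editor's note: an illustrative IR sketch, not part of the original email, showing the two orderings being compared; %addr and %x are taken from the test's CHECK line above.]

  ; What the patch currently emits: seq_cst, which per the comment above
  ; introduces an additional cache flush relative to the old intrinsic lowering.
  %old.seqcst = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") seq_cst, align 4

  ; The suggested alternative: release still orders prior memory operations
  ; but, per the comment, avoids the extra flush after the RMW.
  %old.release = atomicrmw fadd ptr addrspace(1) %addr, float %x syncscope("agent") release, align 4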