rampitec requested changes to this revision.
rampitec added a comment.
This revision now requires changes to proceed.

Needs an IR test, a test for different supported targets, and a negative test for unsupported features.


================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:199
+TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_f64, "dd*1di", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_global_atomic_fadd_f32, "ff*1fi", "t", "gfx90a-insts")
----------------
The correct attribute for this one is atomic-fadd-insts. In particular, it was first added in gfx908, so you would need to test that target too.


================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:205
+
+TARGET_BUILTIN(__builtin_amdgcn_flat_atomic_fadd_f64, "dd*1di", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_flat_atomic_fmin_f64, "dd*1di", "t", "gfx90a-insts")
----------------
Flat address space is 0.


================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:210
+TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_f64, "dd*3di", "t", "gfx90a-insts")
+TARGET_BUILTIN(__builtin_amdgcn_ds_atomic_fadd_f32, "ff*3fi", "t", "gfx90a-insts")
+
----------------
This is available since gfx8. The attribute is gfx8-insts.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:16212
+  case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmax_f64: {
+    Intrinsic::ID IID;
+    llvm::Type *ArgTy = llvm::Type::getDoubleTy(getLLVMContext());
----------------
You do not need any of that code. You can directly map a builtin to an intrinsic in IntrinsicsAMDGPU.td.


================
Comment at: clang/test/CodeGenOpenCL/builtins-fp-atomics.cl:112
+kernel void test_flat_global_max(__global double *addr, double x){
+  __builtin_amdgcn_flat_atomic_fmax_f64(addr, x, memory_order_relaxed);
+}
----------------
arsenm wrote:
> gandhi21299 wrote:
> > arsenm wrote:
> > > If you're going to bother testing the ISA, is it worth testing rtn and no
> > > rtn versions?
> > Sorry, what do you mean by rtn version?
> Most atomics can be optimized if they don't return the in memory value if the
> value is unused
Certainly yes, because global_atomic_add_f32 did not have a return version on gfx908.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106909/new/

https://reviews.llvm.org/D106909

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
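
Illustrative note on the "rtn" versus "no rtn" distinction raised in the last comment: the sketch below is only an analogy in standard C++ using std::atomic, not the AMDGPU builtins under review. The function names add_no_rtn and add_rtn are hypothetical. It shows the two shapes a test would need to cover: one call whose result is discarded, and one whose result is consumed.

```cpp
#include <atomic>

// "No rtn" shape: the value previously in memory is discarded, so a
// compiler is free to select an instruction form that returns nothing.
void add_no_rtn(std::atomic<int> &counter, int x) {
  counter.fetch_add(x, std::memory_order_relaxed);
}

// "Rtn" shape: the caller consumes the value that was in memory before
// the add, so the returning form of the atomic is required.
int add_rtn(std::atomic<int> &counter, int x) {
  return counter.fetch_add(x, std::memory_order_relaxed);
}
```

Under that assumption, a codegen test covering both cases would pair a kernel that ignores the builtin's result with one that stores the result somewhere observable.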