https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/185889

Summary:
__clc_mem_fence can be implemented generically on top of
__scoped_atomic_thread_fence, which works as expected on NVPTX and
SPIR-V. We do not replace the AMDGPU implementation because its
dedicated built-in takes an extra argument that controls whether local
memory or global memory is invalidated. The generic operation would be
correct there too, but we would lose that minor optimization, so we
should probably keep the AMDGPU version to avoid a regression.
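
For readers unfamiliar with the builtin, here is a minimal host-side sketch of the generic approach, not the libclc source itself. The names `sketch_mem_fence`, `sketch_publish_and_read`, and `SKETCH_HAVE_SCOPED_FENCE` are hypothetical, and the fallback scope values are placeholders used only when the compiler does not predefine the `__MEMORY_SCOPE_*` macros. As far as I understand the Clang builtin, it takes the memory order first and the scope second. On compilers without it, the sketch swaps in the unscoped `__atomic_thread_fence`, which ignores the scope.

```c
/* Detect Clang's scoped fence builtin; older compilers may lack __has_builtin. */
#if defined(__has_builtin)
#if __has_builtin(__scoped_atomic_thread_fence)
#define SKETCH_HAVE_SCOPED_FENCE 1
#endif
#endif
#ifndef SKETCH_HAVE_SCOPED_FENCE
#define SKETCH_HAVE_SCOPED_FENCE 0
#endif

/* Placeholder values, used only if the compiler does not predefine them. */
#ifndef __MEMORY_SCOPE_DEVICE
#define __MEMORY_SCOPE_DEVICE 1
#endif
#ifndef __MEMORY_SCOPE_WRKGRP
#define __MEMORY_SCOPE_WRKGRP 2
#endif

/* Hypothetical stand-in for __clc_mem_fence: forward scope and order to one
 * target-independent fence instead of a target-specific barrier. */
static void sketch_mem_fence(int memory_scope, int memory_order) {
#if SKETCH_HAVE_SCOPED_FENCE
  /* Assumed signature: order first, then scope. */
  __scoped_atomic_thread_fence(memory_order, memory_scope);
#else
  /* Fallback (an assumption of this sketch): unscoped fence, scope ignored. */
  (void)memory_scope;
  __atomic_thread_fence(memory_order);
#endif
}

/* Single-threaded illustration of the usual publish/consume pattern:
 * write the payload, release-fence, set the flag; then read the flag,
 * acquire-fence, read the payload. */
static int sketch_publish_and_read(void) {
  int payload = 0;
  int ready = 0;

  payload = 42;
  sketch_mem_fence(__MEMORY_SCOPE_WRKGRP, __ATOMIC_RELEASE);
  __atomic_store_n(&ready, 1, __ATOMIC_RELAXED);

  if (__atomic_load_n(&ready, __ATOMIC_RELAXED)) {
    sketch_mem_fence(__MEMORY_SCOPE_WRKGRP, __ATOMIC_ACQUIRE);
    return payload;
  }
  return -1;
}
```

Calling `sketch_publish_and_read()` from a test or from `main` exercises both fences. On a GPU target the scope argument is what lets the backend pick a work-group or device-wide fence, which is the part a host build cannot show.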


From afe8e670b75699ef87a307d72ccf7e9bac28d51d Mon Sep 17 00:00:00 2001
From: Joseph Huber <[email protected]>
Date: Wed, 11 Mar 2026 09:18:51 -0500
Subject: [PATCH] [libclc] Add generic clc_mem_fence instruction

Summary:
This can be made generic, which works as expected on NVPTX and SPIR-V.
We do not replace this for AMDGPU because the dedicated built-in has an
extra argument that controls whether or not local memory or global
memory will be invalidated. It would be correct to use this generic
operation there, but we'd lose that minor optimization so we likely
should not regress.
---
 libclc/clc/lib/generic/CMakeLists.txt                         | 1 +
 .../lib/{ptx-nvidiacl => generic}/mem_fence/clc_mem_fence.cl  | 4 +---
 libclc/clc/lib/ptx-nvidiacl/CMakeLists.txt                    | 1 -
 3 files changed, 2 insertions(+), 4 deletions(-)
 rename libclc/clc/lib/{ptx-nvidiacl => generic}/mem_fence/clc_mem_fence.cl (83%)

diff --git a/libclc/clc/lib/generic/CMakeLists.txt b/libclc/clc/lib/generic/CMakeLists.txt
index 7d7286de11f85..50b12d3bf4e3d 100644
--- a/libclc/clc/lib/generic/CMakeLists.txt
+++ b/libclc/clc/lib/generic/CMakeLists.txt
@@ -159,6 +159,7 @@ libclc_configure_source_list(CLC_GENERIC_SOURCES
   math/clc_tanpi.cl
   math/clc_tgamma.cl
   math/clc_trunc.cl
+  mem_fence/clc_mem_fence.cl
   misc/clc_shuffle.cl
   misc/clc_shuffle2.cl
   relational/clc_all.cl
diff --git a/libclc/clc/lib/ptx-nvidiacl/mem_fence/clc_mem_fence.cl b/libclc/clc/lib/generic/mem_fence/clc_mem_fence.cl
similarity index 83%
rename from libclc/clc/lib/ptx-nvidiacl/mem_fence/clc_mem_fence.cl
rename to libclc/clc/lib/generic/mem_fence/clc_mem_fence.cl
index fdec76ebc3c57..ded413308e56c 100644
--- a/libclc/clc/lib/ptx-nvidiacl/mem_fence/clc_mem_fence.cl
+++ b/libclc/clc/lib/generic/mem_fence/clc_mem_fence.cl
@@ -11,8 +11,6 @@
 _CLC_OVERLOAD _CLC_DEF void
 __clc_mem_fence(int memory_scope, int memory_order,
                 __CLC_MemorySemantics memory_semantics) {
-  (void)memory_order;
   (void)memory_semantics;
-  if (memory_scope & (__MEMORY_SCOPE_DEVICE | __MEMORY_SCOPE_WRKGRP))
-    __nvvm_membar_cta();
+  __scoped_atomic_thread_fence(memory_order, memory_scope);
 }
diff --git a/libclc/clc/lib/ptx-nvidiacl/CMakeLists.txt b/libclc/clc/lib/ptx-nvidiacl/CMakeLists.txt
index f345007e852e2..6eb0baab1c0bb 100644
--- a/libclc/clc/lib/ptx-nvidiacl/CMakeLists.txt
+++ b/libclc/clc/lib/ptx-nvidiacl/CMakeLists.txt
@@ -4,7 +4,6 @@ libclc_configure_source_list(CLC_PTX_NVIDIACL_SOURCES
   math/clc_rsqrt.cl
   math/clc_sinpi.cl
   math/clc_sqrt.cl
-  mem_fence/clc_mem_fence.cl
   relational/clc_isinf.cl
   synchronization/clc_work_group_barrier.cl
   workitem/clc_get_global_id.cl

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
