Bug ID: 85341
Summary: [nvptx] Implement atomic load
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
[ Follow-up PR of PR84041 - "[nvptx] Hang in for-3.c" ]
At the moment the nvptx port does not define an atomic load insn. Consequently,
__atomic_load goes through the fallback path in expand_atomic_load and ends up
generating a regular load insn combined with a membar.sys memory barrier.
The __atomic_load builtin is defined as:
Built-in Function: type __atomic_load_n (type *ptr, int memorder)
This built-in function implements an atomic load operation. It returns the
contents of *ptr.
The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST,
__ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.
The atomic_load insn pattern is described like this (with a local fix applied
for https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00517.html ):
This pattern implements an atomic load operation with memory model
semantics. Operand 1 is the memory address being loaded from. Operand 0 is the
result of the load. Operand 2 is the memory model to be used for the load.
If not present, the __atomic_load built-in function will resort to a normal
load with memory barriers.
If we defined an atomic_load insn pattern, we would be able to use the pointer
operand to deduce a reduced scope (.gpu or .cta) for the memory barrier.
Say we define memory spaces __global and __shared; then we could use
membar.gpu for __global and membar.cta for __shared.
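A sketch of the PTX this could produce (illustrative only; the state-space
qualifiers and register names are assumptions, not actual compiler output):

```
// Current fallback: plain load plus a system-wide barrier.
ld.u32        %r0, [addr];
membar.sys;             // conservatively orders across the whole system

// With an atomic_load pattern that sees the address space:
ld.global.u32 %r1, [gaddr];
membar.gpu;             // __global: scope reduced to the device

ld.shared.u32 %r2, [saddr];
membar.cta;             // __shared: scope reduced to the thread block
```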
Of course, we'd have to annotate libgomp/config/nvptx with the appropriate
memory spaces, otherwise we'd keep generating the same code there.