Bug ID: 85341
           Summary: [nvptx] Implement atomic load
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot
          Reporter: vries at gcc dot
  Target Milestone: ---

[ Follow-up PR of PR84041 - "[nvptx] Hang in for-3.c" ]

At the moment the nvptx port does not define an atomic load insn. Consequently,
it goes through the fallback scenario in expand_atomic_load, and ends up
generating a regular load insn combined with a membar.sys memory barrier.

[ Context:

The __atomic_load builtin is defined as:
Built-in Function: type __atomic_load_n (type *ptr, int memorder)

    This built-in function implements an atomic load operation. It returns the
contents of *ptr.

    The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST,
__ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.

The atomic_load insn pattern is described like this (with a local fix applied
for ):

    This pattern implements an atomic load operation with memory model
semantics. Operand 1 is the memory address being loaded from. Operand 0 is the
result of the load. Operand 2 is the memory model to be used for the load.

    If not present, the __atomic_load built-in function will resort to a normal
load with memory barriers. 

If we defined an atomic_load insn pattern, we would be able to use the pointer
operand to deduce a reduced scope (.gpu or .cta) for the memory barrier.

Say we define memory spaces __global and __shared; then we could use
membar.gpu for __global and membar.cta for __shared.
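A rough sketch of the idea in pseudo-PTX (the exact placement of the barrier
relative to the load depends on the memory model; the state-space annotations
are hypothetical):

```
// __atomic_load_n (p, __ATOMIC_SEQ_CST), assuming p's space is known:
//
//   ld.global.u32 %r, [p];  membar.gpu;  // p in __global -> device scope
//   ld.shared.u32 %r, [p];  membar.cta;  // p in __shared -> CTA scope
//   ld.u32        %r, [p];  membar.sys;  // generic p -> conservative, as now
```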

Of course, we'd have to annotate libgomp/config/nvptx with the appropriate
memory spaces, otherwise we will keep generating the same code there.
