https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111416

            Bug ID: 111416
           Summary: [Armv7/v8 Mixing Bug]: 64-bit Sequentially Consistent
                    Load can be Reordered before Store of RMW when v7 and
                    v7 Implementations are Mixed
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: translation
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luke.geeson at cs dot ucl.ac.uk
  Target Milestone: ---

Consider the following litmus test that has buggy behaviour:
```
C test

{ int64_t x = 0; int64_t y = 0 }

P0 (_Atomic int64_t *x, _Atomic int64_t *y) {
  atomic_fetch_add_explicit(x,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(y,memory_order_seq_cst);
}

void P1 (_Atomic int64_t  *x, _Atomic int64_t  *y) {
  atomic_store_explicit(y,1,memory_order_seq_cst);
  int32_t r0 = atomic_load_explicit(x,memory_order_seq_cst);
}

exists P0:r0 = 0 /\ P1:r0 = 0
```
where 'P0:r0 = 0' means thread P0, local variable r0 has value 0

When simulating this test under the C/C++ model from its initial state, the
outcome of execution in the exists clause is forbidden by the source model. The
allowed outcomes are:
```
{ P0:r0=0; P1:r0=1; }
{ P0:r0=1; P1:r0=0; }
{ P0:r0=1; P1:r0=1; }
```
When compiling P1, to target armv7-a cortex-a53
(https://godbolt.org/z/efGnsa19G) using clang trunk, compiling the fetch_add on
P0 to target a cortex-a53 using clang trunk (`ldaexd;add;stlexd` loop), and the
load on P0 to target a cortex-a15 (`ldrd;dmb`) using GCC trunk for cortex-a15.
The compiled program has the following outcomes when simulated under the
aarch32 model:
```
{ P0:r0=0; P1:r0=0; } <--- Forbidden by source model, bug!
{ P0:r0=0; P1:r0=1; }
{ P0:r0=1; P1:r0=0; }
{ P0:r0=1; P1:r0=1; }
```
which is due to the fact the LDRD on P0 can be reordered befofre the stlexd on
P0 since there is no dmb barrier to prevent the reordering.

Since there is no acquire load on armv7, we propose to fix the bug by adding a
fence before the ldrd:
```
dmb ish; ldrd; dmb ish
```
Which prevents the buggy outcome under the aarch32 memory model.

I have validated this bug whilst discussing with Wilco from Arm's compiler
teams.

This bug would not have been caught in normal execution, but only when multiple
implementations are mixed together.
  • [Bug translation/111416] New:... luke.geeson at cs dot ucl.ac.uk via Gcc-bugs

Reply via email to