https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

            Bug ID: 114865
           Summary: std::atomic<X>::compare_exchange_strong seems to hang
                    under GCC 13 on Ubuntu 23.04
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pdimov at gmail dot com
  Target Milestone: ---

I'm getting weird hangs on GitHub Actions when using
`std::atomic<state_type>::compare_exchange_strong` under GCC 13 on Ubuntu 23.04
only; GCC 12 and earlier on Ubuntu 22.04 and earlier work. `state_type` is
defined as

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};
```

and the code doing the CAS is

```
auto oldst = ps_->load( std::memory_order_relaxed );

for( ;; )
{
    auto newst = get_new_state( oldst );

    if( ps_->compare_exchange_strong( oldst, newst, std::memory_order_relaxed,
        std::memory_order_relaxed ) )
    {
        state_ = newst;
        break;
    }
}
```

where `ps_` is of type `std::atomic<state_type>*`.

At a glance, I see nothing immediately wrong with the generated code
(https://godbolt.org/z/8Ee3hrTz8).

However, when I change `state_type` to

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t padding[ 3 ];
};
```

the hangs disappear. This leads me to think that the problem is caused by the
original struct having padding, which isn't being handled correctly for some
reason.
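To make the padding hypothesis concrete, here is a minimal sketch that computes how many bytes of each struct are not covered by a member. The names `padded_state_type`, `original_padding_bytes` and `explicit_padding_bytes` are mine, introduced only for illustration. On typical 64-bit ABIs I would expect the first value to be 6 and the second to be 0, but the exact numbers are ABI-dependent.

```cpp
#include <cstddef>
#include <cstdint>

// The original struct: 10 bytes of members, rounded up to the alignment
// of std::uint64_t, so the tail of the object is padding.
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};

// The workaround struct (hypothetical name): the tail bytes are now an
// ordinary member, so they have well-defined values.
struct padded_state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t padding[ 3 ];
};

// Bytes of the object representation not covered by any member.
constexpr std::size_t original_padding_bytes =
    sizeof( state_type ) - sizeof( std::uint64_t ) - sizeof( std::uint16_t );

constexpr std::size_t explicit_padding_bytes =
    sizeof( padded_state_type ) - sizeof( std::uint64_t )
    - sizeof( std::uint16_t ) - sizeof( std::uint16_t[ 3 ] );
```

In C++17 and later, `std::has_unique_object_representations` from `<type_traits>` should give a related yes/no answer for each struct without the manual arithmetic.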

As we know, `std::atomic<T>::compare_exchange_strong` is carefully specified to
take and return `expected` by reference, such that it can both compare the
entire object as if via `memcmp` (including the padding), and return it as if
by `memcpy`, again including the padding. Even though the padding bits of the
initial value returned by the atomic load are unspecified, at most one failed
iteration of the loop should be needed (absent contention from other threads)
for the padding bits to converge and for the CAS to succeed.
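The convergence argument above can be sketched with a single-threaded model that works on raw bytes. This is not the libstdc++ or libatomic implementation; `raw_state`, `make_raw`, `cas_model` and `failed_attempts_before_success` are hypothetical names, and the model only illustrates the "as if via `memcmp` / `memcpy`" behaviour I am assuming. It also assumes that `state_type` has trailing padding, as it does on common ABIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};

// Raw object representation of a state_type, padding bytes included.
struct raw_state
{
    unsigned char bytes[ sizeof( state_type ) ];
};

// Hypothetical helper: every byte not covered by a member is set to `fill`.
inline raw_state make_raw( std::uint64_t timestamp, std::uint16_t clock_seq,
    unsigned char fill )
{
    raw_state r;
    std::memset( r.bytes, fill, sizeof( r.bytes ) );
    std::memcpy( r.bytes + offsetof( state_type, timestamp ), &timestamp, sizeof( timestamp ) );
    std::memcpy( r.bytes + offsetof( state_type, clock_seq ), &clock_seq, sizeof( clock_seq ) );
    return r;
}

// Model of a strong CAS: compare all bytes, store `desired` on a match,
// otherwise copy the stored bytes (padding included) back into `expected`.
inline bool cas_model( raw_state& stored, raw_state& expected, raw_state const& desired )
{
    if( std::memcmp( stored.bytes, expected.bytes, sizeof( stored.bytes ) ) == 0 )
    {
        std::memcpy( stored.bytes, desired.bytes, sizeof( stored.bytes ) );
        return true;
    }

    std::memcpy( expected.bytes, stored.bytes, sizeof( stored.bytes ) );
    return false;
}

// Runs a bounded retry loop and returns the number of failed attempts
// before the first success, or -1 if the bound is reached.
inline int failed_attempts_before_success()
{
    raw_state stored = make_raw( 1, 2, 0xAA );
    raw_state expected = make_raw( 1, 2, 0x55 ); // same members, different padding
    raw_state const desired = make_raw( 3, 4, 0x00 );

    for( int failures = 0; failures < 8; ++failures )
    {
        if( cas_model( stored, expected, desired ) )
        {
            // On success the stored bytes are exactly the desired bytes.
            assert( std::memcmp( stored.bytes, desired.bytes, sizeof( stored.bytes ) ) == 0 );
            return failures;
        }
    }

    return -1;
}
```

In this model the first attempt fails only because of the padding bytes, and the second succeeds because `expected` now carries the stored padding. The bounded loop is deliberate: a real reproducer with the same bound would report non-convergence instead of hanging.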

However, going by the symptoms alone, this doesn't seem to be the case here.

The problem may well be inside libatomic, of course; I have no way to tell.

One GHA run showing the issue is
https://github.com/boostorg/uuid/actions/runs/8821753835, where only the GCC 13
job times out.
