https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79564
Bug ID: 79564
Summary: [missed optimization][x86] relaxed atomic counting compiled the same as seq_cst
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: marc.mutz at kdab dot com
Target Milestone: ---

I believe that the following code (https://godbolt.org/g/81DcP8):

    #include <atomic>

    int count_relaxed(const char *str) {
        static std::atomic_int counter = {0};
        while (*str++)
            counter.fetch_add(1, std::memory_order_relaxed);
        return counter;
    }

could be optimized into

    register r = __builtin_strlen(str);
    counter.fetch_add(r, std::memory_order_relaxed);

Signed overflow is UB, so can be assumed not to happen. The modification order of a relaxed atomic variable is not related to the modification order of any other variable. All the compiler must ensure is that no value can be read after a later one has been seen by this thread, and that no values are read that weren't written by some other, possibly the same, thread.

Performing a single fetch_add() at the end of the loop (possibly guarded by a check for zero to not introduce a (no-op) write where there's none in the source) should be a valid optimization.

As the godbolt link shows, it currently is compiled identically to the seq_cst version.
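
For illustration, the proposed transformation can be written out by hand as a minimal sketch. The function names count_per_char and count_batched are hypothetical, and the counter is passed by reference (rather than being a function-local static as in the report) purely so the two variants can be compared side by side. The sketch only shows that both forms leave the same final value in the counter when called from a single thread; whether a compiler is allowed to do this rewrite on its own is exactly what the report asks.

```cpp
#include <atomic>
#include <cstring>

// Hypothetical helper: the loop from the report, one relaxed RMW per character.
int count_per_char(std::atomic_int &counter, const char *str) {
    while (*str++)
        counter.fetch_add(1, std::memory_order_relaxed);
    return counter.load();  // seq_cst load, like the implicit conversion in the report
}

// Hypothetical helper: the hand-written form of the proposed optimization.
// The length is taken from the original pointer, before any increment.
int count_batched(std::atomic_int &counter, const char *str) {
    const int n = static_cast<int>(std::strlen(str));
    if (n != 0)  // guard: do not introduce a write where the source has none
        counter.fetch_add(n, std::memory_order_relaxed);
    return counter.load();
}
```

With other threads running, the per-character version exposes intermediate counter values that the batched version never writes; the report's argument is that relaxed ordering does not oblige the implementation to make those intermediate values observable.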