https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108658
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
The relevant optimization happens in invariant motion which applies
store-motion to

void * idle (void * ignored)
{
  long int PROF_edge_counter_1;
  long int PROF_edge_counter_2;

  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  PROF_edge_counter_1 = __gcov0.idle[0];
  PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
  __gcov0.idle[0] = PROF_edge_counter_2;
  goto <bb 3>; [100.00%]

producing

void * idle (void * ignored)
{
  long int __gcov0.idle_I_lsm.4;
  long int PROF_edge_counter_1;
  long int PROF_edge_counter_2;

  <bb 2> [local count: 10631108]:
  __gcov0.idle_I_lsm.4_7 = __gcov0.idle[0];

  <bb 3> [local count: 1073741824]:
  # __gcov0.idle_I_lsm.4_6 = PHI <__gcov0.idle_I_lsm.4_7(2), __gcov0.idle_I_lsm.4_8(4)>
  PROF_edge_counter_1 = __gcov0.idle_I_lsm.4_6;
  PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
  __gcov0.idle_I_lsm.4_8 = PROF_edge_counter_2;

  <bb 4> [local count: 1073741824]:
  goto <bb 3>; [100.00%]

that's not wrong I think.  With -fprofile-update=atomic that doesn't happen
but the atomic update call never gets a location assigned, instead we
rely on the stmt-begin/end notes here?

void * idle (void * ignored)
{
  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  __atomic_fetch_add_8 (&__gcov0.idle[0], 1, 0);
  goto <bb 3>; [100.00%]
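
For readers less used to GIMPLE, the three dumps can be approximated in plain C. This is only a hand-written sketch, not compiler output: `counter` is a hypothetical stand-in for `__gcov0.idle[0]`, the function names and the `run` helper are made up for illustration, and the loops are given a trip count `n` so that they terminate, which the endless loop in `idle` does not.

```c
#include <assert.h>

/* Hypothetical stand-in for the __gcov0.idle[0] counter in the dumps. */
static long counter;

/* Before store motion: load, add and store on every iteration. */
static void count_in_memory(long n)
{
    for (long i = 0; i < n; i++)
        counter = counter + 1;
}

/* After store motion, written out by hand: load once before the loop,
   keep the running value in a local, store once on the loop exit. */
static void count_with_store_motion(long n)
{
    long lsm = counter;          /* hoisted load, like bb 2 in the dump */
    for (long i = 0; i < n; i++)
        lsm = lsm + 1;           /* no memory access in the loop body */
    counter = lsm;               /* sunk store; only runs if the loop exits */
}

/* Roughly the -fprofile-update=atomic form; the 0 passed as the last
   argument in the dump is __ATOMIC_RELAXED. */
static void count_atomic(long n)
{
    for (long i = 0; i < n; i++)
        __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
}

/* Run one variant from a zeroed counter and return the final value. */
static long run(void (*variant)(long), long n)
{
    assert(n >= 0);
    counter = 0;
    variant(n);
    return counter;
}
```

With a finite `n` all three variants leave the same value in `counter`, e.g. `run(count_with_store_motion, 1000)` returns 1000. The difference only shows when the loop has no exit, as in `idle`: the sunk store in the second variant then has nowhere to go, which matches the transformed dump, where the load in bb 2 is the only remaining access to `__gcov0.idle[0]`.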