https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84411
--- Comment #4 from Drea Pinski <pinskia at gcc dot gnu.org> ---
The way I am thinking about fixing this is to pattern match the sequence
below, starting at the `__cxa_guard_acquire` call (ignoring the
`__atomic_load_8` part, since it will be removed):
```
  <bb 2> [local count: 1073741824]:
  # .MEM_6 = VDEF <.MEM_5(D)>
  _1 = __atomic_load_8 (&_ZGVZ1fvE1a, 2);
  _2 = _1 & 1;
  if (_2 == 0)
    goto <bb 3>; [33.00%]
  else
    goto <bb 5>; [67.00%]

  <bb 3> [local count: 354334800]:
  # .MEM_7 = VDEF <.MEM_6>
  _3 = __cxa_guard_acquire (&_ZGVZ1fvE1a);
  if (_3 != 0)
    goto <bb 4>; [33.00%]
  else
    goto <bb 5>; [67.00%]

  <bb 4> [local count: 116930483]:
  # .MEM_8 = VDEF <.MEM_7>
  __cxa_guard_release (&_ZGVZ1fvE1a);

  <bb 5> [local count: 1073741824]:
```
The match would be done in forwprop, which would then remove the
`__cxa_guard_release` call and change `_3 = __cxa_guard_acquire` into just
`_3 = 1`.
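
To make the shape of the rewrite easier to follow, here is a rough C++ model
of the control flow in the dump. This is only a sketch under assumptions:
`model_guard_acquire`, `model_guard_release`, `f_before`, `f_after` and the
two counters are hypothetical names made up for illustration, the model is
single-threaded, and it does not reproduce the real locking done by the ABI
functions. It is not the forwprop implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical counters, only here so the effect of the rewrite can be observed.
static int acquire_calls = 0;
static int release_calls = 0;

// Simplified stand-in for __cxa_guard_acquire: returns 1 when the caller
// must run the initializer. No locking, single-threaded model only.
static int model_guard_acquire(std::atomic<std::uint64_t>* g) {
    ++acquire_calls;
    return (g->load(std::memory_order_acquire) & 1) == 0;
}

// Simplified stand-in for __cxa_guard_release: marks the guard as initialized.
static void model_guard_release(std::atomic<std::uint64_t>* g) {
    ++release_calls;
    g->store(1, std::memory_order_release);
}

// Shape of the dump: nothing happens between acquire and release.
void f_before(std::atomic<std::uint64_t>* g) {
    if ((g->load(std::memory_order_acquire) & 1) == 0) {  // bb 2
        if (model_guard_acquire(g) != 0) {                // bb 3
            model_guard_release(g);                       // bb 4
        }
    }
}                                                         // bb 5

// Shape after the proposed rewrite: the release call is gone and the
// result of the acquire call is replaced by the constant 1.
void f_after(std::atomic<std::uint64_t>* g) {
    if ((g->load(std::memory_order_acquire) & 1) == 0) {  // bb 2
        int acquired = 1;                                 // was: _3 = __cxa_guard_acquire
        if (acquired != 0) {
            // bb 4 is now empty
        }
    }
}
```

In this model `f_after` no longer calls either guard function and never
writes to the guard, so what is left in it has no side effects.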