https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110177
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So if we had a copy loop header pass after pre/lim2, this would be optimized
correctly.
After pre we have:
```
<bb 2> [local count: 39919534]:
c = 0;
e.1_1 = e;
_8 = g;
goto <bb 13>; [100.00%]
<bb 3>:
...
<bb 13> [local count: 1073741824]:
# c_lsm.16_7 = PHI <0(2), _12(12)>
# c_lsm_flag.17_22 = PHI <0(2), 1(12)>
if (c_lsm.16_7 <= 20)
goto <bb 3>; [96.34%]
else
goto <bb 14>; [3.66%]
<bb 14> [local count: 39298952]:
if (c_lsm_flag.17_22 != 0)
goto <bb 15>; [66.67%]
else
goto <bb 16>; [33.33%]
```
Copying the header makes c_lsm_flag.17_22 always 1 when exiting the loop
there, and then we can remove a few more things.
f is always dereferenced and so can no longer be null.
The reason it works at -O2 is that copy loop header happens before pre/lim,
and then things just work.
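
To illustrate the header-copying point, here is a rough C sketch of the two
loop shapes. It is only a model of the dump above, not the testcase from this
bug: the function names, the `start` parameter and the increment in the loop
body are made up for illustration (in the dump the initial value is the
constant 0 and the body is elided).

```c
#include <stdbool.h>

/* Shape after pre/lim: the exit test sits in the loop header (bb 13), so
   the exit block sees flag = PHI <0 (preheader), 1 (latch)> and the value
   is not known at compile time. */
int flag_top_tested(int start)
{
    int c_lsm = start;        /* models c_lsm.16_7 */
    bool c_lsm_flag = false;  /* models c_lsm_flag.17_22 */

    while (c_lsm <= 20) {
        c_lsm = c_lsm + 1;    /* hypothetical stand-in for the elided body */
        c_lsm_flag = true;    /* a store to c happened on this path */
    }
    return c_lsm_flag;
}

/* Shape after copying the header: the first test is peeled in front of the
   loop and the loop becomes bottom-tested.  The only exit from the loop is
   now from the latch, where the flag is always 1. */
int flag_header_copied(int start)
{
    int c_lsm = start;
    bool c_lsm_flag = false;

    if (c_lsm <= 20) {        /* copied header; folds to true when start is 0 */
        do {
            c_lsm = c_lsm + 1;
            c_lsm_flag = true;
        } while (c_lsm <= 20);
        /* On this exit c_lsm_flag is known to be 1. */
    }
    return c_lsm_flag;
}
```

With the initial value 0, as in the dump, the peeled test folds to true, so
the flag check after the loop (bb 14) can be folded as well.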