https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104754
Aldy Hernandez <aldyh at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-03-03
                 CC|                            |amacleod at redhat dot com
--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Confirmed on a cross to m68k-unknown-linux-gnu.
Interestingly, this may actually be a regression against GCC 11, at least on this
target (and possibly the others mentioned, though I haven't checked).
The test verifies that there are no calls to foo(). On m68k the gate to foo()
flows through here (threadfull2 dump right before vrp2):
  <bb 3> [local count: 715863673]:
  # ivtmp.9_23 = PHI <ivtmp.9_24(11), ivtmp.9_7(9)>
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)
    goto <bb 4>; [20.24%]
  else
    goto <bb 5>; [79.76%]

  <bb 4> [local count: 144890806]:
  foo ();
ivtmp.9_7, the initial value of the PHI above, has been set previously in BB9:
  ivtmp.9_7 = (unsigned int) &b;
VRP2 can't seem to do anything with the above sequence, since it can't figure
out what _1 is. I suppose it could, since there is enough information to
get at "b" at -O3.
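For orientation, the dumps are consistent with a source loop of roughly the
following shape. This is only a sketch reconstructed from the GIMPLE above, not
the testcase itself: the array bounds, the name run_loop, and the counting
bodies of bar() and foo() are assumptions added so the snippet builds and runs
on its own.

```c
/* Hypothetical stand-ins so the sketch links and runs; a dump-scanning
   test would more likely leave these as external declarations.  */
static int bar_calls, foo_calls;
static void bar (void) { bar_calls++; }
static void foo (void) { foo_calls++; }

/* Assumption: b is file-local, zero-initialised and never written, so
   every load from it is known to yield 0.  */
static long b[2][1];

/* Loop in the shape the dumps suggest: call bar (), load b[c][0], and
   call foo () only when the loaded value is 1.  */
void
run_loop (void)
{
  for (long c = 0; c <= 1; c++)
    {
      bar ();
      if (b[c][0] == 1)
        foo ();
    }
}
```

Under those assumptions the condition can never be true, which is why the
call to foo() is expected to disappear once the load is folded to 0.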
On x86, where the test passes, we have the following before vrp2:
  <bb 3> [local count: 477266310]:
  # c_4 = PHI <c_14(7)>
  bar ();
  _15 = (sizetype) c_4;
  _17 = MEM[(long int *)&b + _15 * 8];
  if (_17 == 1)
    goto <bb 4>; [20.24%]
  else
    goto <bb 5>; [79.76%]

  <bb 4> [local count: 96598701]:
  foo ();
  c_29 = c_4 + 1;
  goto <bb 8>; [100.00%]
which vrp2 can happily optimize to:
  <bb 6> [local count: 477266310]:
  bar ();
  _17 = 0;
  if (_17 == 1)
    goto <bb 3>; [20.24%]
  else
    goto <bb 4>; [79.76%]
  ...
  ...
  <bb 3> [local count: 96598701]:
  foo ();
  goto <bb 7>; [100.00%]
Thus leading to foo's demise by ccp4.
I haven't dug deep, but this is likely due to the pointer equivalence tracking
we use in evrp/VRP2 not being able to see that this is all funny talk for b[]:
  ivtmp.9_7 = (unsigned int) &b;
  ...
  ...
  # ivtmp.9_23 = PHI <ivtmp.9_24(11), ivtmp.9_7(9)>
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  if (_1 == 1)
We have plans for a proper pointer range class for GCC13, though I wonder
whether we'll be able to handle the above gymnastics.
FWIW, the above transformation seems to be ivopts at play.
Whereas on x86 we go from:
  <bb 3> [local count: 715863673]:
  # c_19 = PHI <c_14(12), c_20(10)>
  bar ();
  _1 = b[c_19][0];
  if (_1 == 1)
to:
  <bb 3> [local count: 715863673]:
  # c_19 = PHI <c_14(12), c_20(10)>
  bar ();
  _23 = (sizetype) c_19;
  _1 = MEM[(long int *)&b + _23 * 8];
  if (_1 == 1)
    goto <bb 4>; [20.24%]
on m68k we transform the sequence to:
  <bb 3> [local count: 715863673]:
  # ivtmp.9_23 = PHI <ivtmp.9_24(12), ivtmp.9_7(10)>
  bar ();
  _2 = (void *) ivtmp.9_23;
  _1 = MEM[(long int *)_2];
  ivtmp.9_24 = ivtmp.9_23 + 4;
  if (_1 == 1)
Perhaps someone with more target-foo can opine.
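In C terms, the two address forms look roughly like the sketch below. It is
meant only to illustrate the difference between the dumps, not what ivopts
literally emits: the function names are made up, uintptr_t stands in for the
target's "unsigned int" address type, and b is again assumed to be a
zero-initialised static array.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: same zero-initialised, never-written array as in the dumps.  */
static long b[2][1];

/* x86-like form: the address is &b plus a scaled index, so the base
   object is still visible in the memory reference itself.  */
long
load_base_plus_index (size_t c)
{
  return *(long *) ((char *) &b + c * sizeof (long));
}

/* m68k-like form: the induction variable is an integer that starts at
   the address of b and is bumped by the element size each iteration.
   The load goes through a pointer rebuilt from that integer, so the
   link back to b is only recoverable by looking through the casts
   and the PHI.  */
long
sum_via_integer_iv (size_t n)
{
  uintptr_t iv = (uintptr_t) &b;
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    {
      void *p = (void *) iv;
      sum += *(long *) p;
      iv += sizeof (long);
    }
  return sum;
}
```

Both functions read the same elements of b; the second simply hides the base
object behind an integer, which is the part the range machinery would have to
see through.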