https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123300

--- Comment #9 from Andrew Macleod <amacleod at redhat dot com> ---
hmm.  The algorithm for determining whether an assume can be removed early
checks to make sure that all uses of all exports are dominated by block
containing the branch leading to the unreachable () stmt.  ie

 <bb 2> :
  goto <bb 7>; [INV]

  <bb 3> :
  x_8 = 1 << i_7;
  if (x_8 <= 0)               <<-- this is the branch in BB3
    goto <bb 4>; [INV]
  else
    goto <bb 5>; [INV]

  <bb 4> :
  __builtin_unreachable ();

  <bb 5> :
  if (p_9(D) != 0)
    goto <bb 6>; [INV]
  else
    goto <bb 7>; [INV]

  <bb 6> :
  bar (i_7, x_8);

  <bb 7> :
  # i_2 = PHI <i_7(5), n_4(D)(2), i_7(6)>
  i_7 = i_2 + -1;
  if (i_2 > 0)
    goto <bb 3>; [INV]
  else
    goto <bb 8>; [INV]

Both i_7 and x_8 are exports from bb3,. so we think we can replace the values
as BB3 does in fact dominate all USES of i_7.  There are no uses of i_7 in bb7
or bb8...

What gets missed is that bb3 does NOT dominate the uses of i_2, which can be
recalculated from i_7.   So when it decides to replace i_7 with this new [0,
+INF] range that it thinks is safe,  we quickly recalculate i_2 in bb 7
thinking that i_7 is [0, +inf] and therefore we will always take that branch.

What is missing is that we can only replace i_7 (and any other export) if it
AND allo its dependents are all dominated by bb3.

The dependency list for i_7 is:
         i_7 : i_2(I)

and if we check i_2, we would discover that bb3 does not dominate the use of
i_2 in the bb9 branch.

Its a touch trickier than that in real life (it doesn't dominate the use in the
statement i_7 = i_2 + -1 either, but that one is OK :-P), and dependency chains
can span multiple statements and uses. 

The simple and safe solution is probably to only allow early removal if the
uses in the definition of the export (so in this case i_2) are simple one
immediate-use names.  It would then fail to perform the optimization in this
case, but would not limit cases where a sequence of instructions feeding the
definition of i_7 are only used in calculating i_7.   Which is probably most
cases we see.

Let me work with it for a day.

Reply via email to