https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65443

--- Comment #4 from vries at gcc dot gnu.org ---
(In reply to vries from comment #3)
> (In reply to vries from comment #2)
> > The problem with this transformation is that '_20 + 1' might overflow,
> > that's what the comment 'This may need some additional preconditioning in
> > case NIT = ~0' refers to.
> 
> AFAIU, we might also move 'ivtmp_6 = ivtmp_y + 1' to the end of bb4. That
> way it's not triggered at loop entry, as before the transformation, 
> eliminating the need for '_20 + 1'.

One thing I overlooked there:

  _20 = n_4(D) + 4294967295;

If n == 0, we don't reach the loop.

If n == 1, we reach the loop, and _20 == 0. When we reach the loop condition
from loop entry with ivtmp == 0, ivtmp < _20 evaluates to false, so the loop
body is never executed. That's the problem we're trying to solve using
'_20 + 1', and moving 'ivtmp_6 = ivtmp_y + 1' to the end of bb4 doesn't fix
it.
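
To make the arithmetic above concrete, here is a minimal C sketch. It is not
the GIMPLE from the testcase: nit, bound_plus_one, count_iterations and
check_cases are hypothetical helpers, n is assumed to be a 32-bit unsigned
value, and the loop is assumed to be a simple counted loop that starts at
ivtmp == 0 and tests the bound on entry.

```c
#include <assert.h>
#include <stdint.h>

/* _20 = n_4(D) + 4294967295, i.e. n - 1 modulo 2^32. */
uint32_t nit(uint32_t n)
{
    return (uint32_t)(n + UINT32_C(4294967295));
}

/* The '_20 + 1' bound; wraps around to 0 when _20 == ~0. */
uint32_t bound_plus_one(uint32_t nit_value)
{
    return (uint32_t)(nit_value + 1u);
}

/* Assumed loop shape (not taken from the testcase): the exit test
   'ivtmp < bound' is evaluated on entry and after every increment.
   Returns how many times the body runs. */
uint32_t count_iterations(uint32_t bound)
{
    uint32_t count = 0;
    for (uint32_t ivtmp = 0; ivtmp < bound; ivtmp++)
        count++;
    return count;
}

/* The cases discussed above. */
void check_cases(void)
{
    assert(nit(1) == 0);                                   /* n == 1: _20 == 0 */
    assert(count_iterations(nit(1)) == 0);                 /* 0 < 0 is false   */
    assert(count_iterations(bound_plus_one(nit(1))) == 1); /* 0 < 1 is true    */
    assert(bound_plus_one(UINT32_MAX) == 0);               /* overflow on ~0   */
}
```

Under these assumptions, using _20 as the bound skips the body for n == 1,
using '_20 + 1' runs it once, and '_20 + 1' wraps to 0 if _20 is ever ~0,
which is the overflow comment #2 refers to.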
