On Mon, 27 Oct 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: 27 October 2025 13:37
> > To: Tamar Christina <[email protected]>
> > Cc: [email protected]; nd <[email protected]>; [email protected]
> > Subject: Re: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > 
> > On Sat, 25 Oct 2025, Tamar Christina wrote:
> > 
> > > Arg..
> > >
> > > Just noticed a typo..
> > >
> > > BB 3:
> > >   i_2 = PHI <i_1(2)>
> > >
> > > Should be
> > >
> > >
> > > BB 3:
> > >   i_2 = PHI <i_0(2)>
> > >
> > > Rest is OK. Sorry about that
> > >
> > > ________________________________
> > > From: Tamar Christina
> > > Sent: Saturday, October 25, 2025 10:52 AM
> > > To: [email protected] <[email protected]>
> > > Cc: nd <[email protected]>; [email protected] <[email protected]>; [email protected] <[email protected]>
> > > Subject: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > >
> > > Consider this Fortran testcase:
> > >
> > > integer (8) b, c
> > >   integer  d
> > >   c = 10
> > >   d = 2
> > >   call e ((/ (b, b = j, c, d) /), 0_8, c, d + 0_8)
> > > contains
> > >   subroutine e (a, f, g, h)
> > >     integer (8), dimension (:) :: a
> > >     integer (8)  f, g, h
> > >     i = 1
> > >     do b = f, g, h
> > >       if (a (i) .ne. b) STOP
> > >       i = i + 1
> > >     end do
> > >     if (size (a) .ne. i - 1) STOP 2
> > >   end
> > > end
> > >
> > > which is a reduced example from the testsuite, so it is not added as a new test.
> > >
> > > This testcase starts the loop at i = 1, but no peeling needs to be done.
> > > As a result the loop form looks like
> > >
> > > BB 1:
> > >   i_0 = PHI <1(initial), i_1(latch)>;
> > >
> > >   if (<early-break-condition>)
> > >     ...
> > >   else
> > >     goto BB 2;
> > >
> > > BB 2:
> > >   i_1 = i_0 + 1;
> > >   if (<main-exit>)
> > >     goto BB 3;
> > >   goto BB 1;
> > >
> > > BB 3:
> > >   i_2 = PHI <i_1(2)>;
> > >
> > > Basically, in the main IV exit we don't use the value *after* the
> > > increment but rather the value before it.
> > >
> > > At the moment peeling wipes this information away and replaces the value
> > > in the exit with i_1.  This is incorrect and means we can no longer tell
> > > which value of the IV we should actually be computing for that exit.
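> > >
> > > Illustratively (sketch only, reusing the names from the BBs above), after
> > > peeling the main-exit LC PHI effectively ends up as
> > >
> > > BB 3:
> > >   i_2 = PHI <i_1(2)>;
> > >
> > > i.e. it refers to the incremented value instead of i_0.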
> > >
> > > This isn't triggered often on trunk because an artificial inflation of the
> > > VF causes the main exit not to be reached very often, and such loops aren't
> > > common.  However, the scalar IV code brought this issue to light.
> > >
> > > This patch preserves the value of the LC PHI set during edge redirection.
> > > We still end up creating the new PHI, but it now uses the correct value.
> > > The reason for still creating the duplicate PHI is that code outside of
> > > peeling currently depends on the PHI node ordering, so this is a
> > > conservative fix.  The duplicate PHI is elided later as it will be unused.
> > >
> > > Bootstrapped and regtested on aarch64-none-linux-gnu,
> > > arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu -m32/-m64 with no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >         * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > >         Fix up PHI in exit nodes.
> > >
> > > ---
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index 20141dbc2e54d49bca00a2e75a6733d181e05ed1..96ca273c24680556f16cdc9e465f490d7fcdb8a4 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -1755,6 +1755,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > >                      new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > >                                                       loop_latch_edge (loop));
> > >
> > > +                 /* Some loops start at a non-zero index, which then require
> > > +                    during redirection to return i instead of i + 1.  Edge
> > > +                    redirection already deals with this, but we need to make
> > > +                    sure we don't override the value it puts in for these
> > > +                    cases.  */
> > > +                 imm_use_iterator imm_iter;
> > > +                 use_operand_p use_p;
> > > +                 tree from_def = PHI_RESULT (from_phi);
> > > +                 bool abnormal_exit_condition = false;
> > > +                 FOR_EACH_IMM_USE_FAST (use_p, imm_iter, from_def)
> > > +                   if (gimple_bb (USE_STMT (use_p)) == main_loop_exit_block)
> > > +                     {
> > > +                       abnormal_exit_condition = true;
> > > +                       break;
> > > +                     }
> > > +
> > 
> > I'm a bit confused - isn't this exactly what 'peeled_iters'
> > (aka LOOP_VINFO_EARLY_BREAKS_VECT_PEELED) is about?  That is,
> > this loop is supposed to handle the main exit (and creating
> > all required merge PHIs and eventually missing LC PHIs), so
> > your patch suggests we put in the "wrong" result, but you
> > leave the old 'peeled_iters' code around, even using the
> > original 'caching' (but then for the wrong value?).
> > 
> 
> No, peeled_iters means we want the value at the start of the vector iteration.
> The issue here is that we want the value at the end of the vector iteration
> minus <offset>.
> 
> So they aren't the same.  In a normal loop this adjustment is calculated by
> vect_update_ivs_after_vectorizer through niters; because it has the right PHI
> node, it doesn't use the incorrect final value of i to adjust the count.
> 
> > At the same time the 2nd loop filling the alternate exit cases
> > is always correct?
> 
> Yes, because alternate exits always restart the loop from the start, so the
> reduction for it takes you to before any adjustment for i has been done at all.

So I believe the logic

                  /* When the vector loop is peeled then we need to use the
                     value at start of the loop, otherwise the main loop exit
                     should use the final iter value.  */
                  tree new_arg;
                  if (peeled_iters)
                    new_arg = gimple_phi_result (from_phi);
                  else
                    new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
                                                     loop_latch_edge (loop));

for !peeled_iters is correct.  In this place we are looking for the
initial value to use for the epilog loop - but in this case this
value is different from the actual live value computed by the testcase.
Note there is in general no way to associate the LC PHI [def] with
the IV in the loop header if it is neither the PHI result nor the
latch value - as it is in this case.  Instead the LC PHI is for
an altogether different "IV" (but a related one, offset from it).

And I see '5' being used as the continuation value via
vect_update_ivs_after_vectorizer, so that seems correct.

So I can't really see what is wrong here.  The vectorized code
also looks OK.  It should be possible to set up a C testcase for
this, like:

 int i = 1, j;
 do
   {
     if (...) abort ();
     j = i + 1;
   }
 while (...);
 ... use i ...
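
Filled in a bit (completely untested; the array, bound, and step names below
are just made up for illustration), that could look like:

  long a[64];

  __attribute__((noipa)) int
  f (long g, long h)
  {
    long b = 0;
    int i = 1;
    do
      {
        if (a[i - 1] != b)        /* early exit */
          __builtin_abort ();
        i = i + 1;
        b += h;
      }
    while (b <= g);               /* main exit */
    /* The value live here is i - 1, which folds to the pre-increment IV
       value, so the main-exit LC PHI would use i_0 rather than i_1.  */
    return i - 1;
  }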

Richard.

> > 
> > Does the issue show with the Fortran testcase on current trunk?
> > 
> 
> Yes, the codegen on trunk today is wrong, but the main exit would never be
> reached because of the increased VF.  So the test "passes" because it takes
> the early exit.
> 
> Thanks,
> Tamar
> 
> > Thanks,
> > Richard.
> > 
> > >                    /* Check if we've already created a new phi node during edge
> > >                       redirection and re-use it if so.  Otherwise create a
> > >                       LC PHI node to feed the merge PHI.  */
> > > @@ -1769,6 +1785,11 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > >                      new_arg = *res;
> > >                    else
> > >                      {
> > > +                     /* If we already found an LC PHI for the current PHI then
> > > +                        just re-use it.  */
> > > +                     if (abnormal_exit_condition)
> > > +                       new_arg = from_def;
> > > +
> > >                        /* Create the LC PHI node for the exit.  */
> > >                        tree new_def = copy_ssa_name (new_arg);
> > >                        gphi *lc_phi
> > > @@ -1779,7 +1800,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > >
> > >                    /* Create the PHI node in the merge block merging the
> > >                       main and early exit values.  */
> > > -                 tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > +                 tree new_res = copy_ssa_name (from_def);
> > >                    gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > >                    edge main_e = single_succ_edge (main_loop_exit_block);
> > >                    SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, main_e, new_arg);
> > >
> > >
> > > --
> > >
> > 
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
