> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: 27 October 2025 14:27
> To: Tamar Christina <[email protected]>
> Cc: [email protected]; nd <[email protected]>; [email protected]
> Subject: RE: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
>
> On Mon, 27 Oct 2025, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Richard Biener <[email protected]>
> > > Sent: 27 October 2025 13:37
> > > To: Tamar Christina <[email protected]>
> > > Cc: [email protected]; nd <[email protected]>; [email protected]
> > > Subject: Re: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > >
> > > On Sat, 25 Oct 2025, Tamar Christina wrote:
> > >
> > > > Arg..
> > > >
> > > > Just noticed a typo..
> > > >
> > > > BB 3:
> > > > i_2 = PHI <i_1(2)>
> > > >
> > > > Should be
> > > >
> > > >
> > > > BB 3:
> > > > i_2 = PHI <i_0(2)>
> > > >
> > > > Rest is OK. Sorry about that
> > > >
> > > > ________________________________
> > > > From: Tamar Christina
> > > > Sent: Saturday, October 25, 2025 10:52 AM
> > > > To: [email protected] <[email protected]>
> > > > Cc: nd <[email protected]>; [email protected] <[email protected]>; [email protected] <[email protected]>
> > > > Subject: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > > >
> > > > Consider this Fortran testcase:
> > > >
> > > > integer (8) b, c
> > > > integer d
> > > > c = 10
> > > > d = 2
> > > > call e ((/ (b, b = j, c, d) /), 0_8, c, d + 0_8)
> > > > contains
> > > > subroutine e (a, f, g, h)
> > > > integer (8), dimension (:) :: a
> > > > integer (8) f, g, h
> > > > i = 1
> > > > do b = f, g, h
> > > > if (a (i) .ne. b) STOP
> > > > i = i + 1
> > > > end do
> > > > if (size (a) .ne. i - 1) STOP 2
> > > > end
> > > > end
> > > >
> > > > which is a reduced example from the testsuite, so it is not added as
> > > > a new test.
> > > >
> > > > This testcase starts the loop at i = 1, but no peeling needs to be done.
> > > > As a result the loop form looks like
> > > >
> > > > BB 1:
> > > > i_0 = PHI <1(initial), i_1(latch)>;
> > > >
> > > > if (<early-break-condition>)
> > > > ...
> > > > else
> > > > goto BB 2;
> > > >
> > > > BB 2:
> > > > i_1 = i_0 + 1;
> > > > if (<main-exit>)
> > > > goto BB 3;
> > > > goto BB 1;
> > > >
> > > > BB 3:
> > > > i_2 = PHI <i_1(2)>;
> > > >
> > > > Basically, in the main IV exit we don't use the value *after*
> > > > incrementing but rather the value before incrementing.
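> > > >
> > > > For intuition, here is a (hypothetical, simplified) C analogue of
> > > > that shape, leaving out the early break, where the value live on the
> > > > main exit is the IV before the increment:
> > > >
> > > >   /* Returns the value the last loop iteration ran with, i.e. the
> > > >      pre-increment IV (i_0 above), not the incremented one (i_1).  */
> > > >   int
> > > >   last_iter (int n)
> > > >   {
> > > >     int i = 1, j;
> > > >     do
> > > >       {
> > > >         j = i + 1;      /* i_1 = i_0 + 1 */
> > > >         if (j > n)      /* main exit */
> > > >           break;
> > > >         i = j;
> > > >       }
> > > >     while (1);
> > > >     return i;           /* exit uses i_0, not i_1 */
> > > >   }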
> > > >
> > > > At the moment, during peeling we wipe away this information and
> > > > replace the value in the exit with i_1.  This is incorrect and means
> > > > we can no longer tell which value of the IV we should actually be
> > > > calculating for that exit.
> > > >
> > > > This isn't triggered often on trunk because of an artificial
> > > > inflation of the VF, which causes the main exit not to be reached
> > > > very often, and such loops aren't common.  However, with the scalar
> > > > IV code this issue was brought to light.
> > > >
> > > > This patch preserves the value of the LC PHI set during edge
> > > > redirection.  We still end up creating the new PHI, but it now uses
> > > > the correct value.  The reason for still creating the duplicate PHI
> > > > is that there is code outside of peeling that today depends on the
> > > > PHI node ordering, so this was a conservative fix.  The duplicate
> > > > PHI gets elided later as it'll be unused.
> > > >
> > > > Bootstrapped and regtested on aarch64-none-linux-gnu,
> > > > arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu (-m32 and -m64)
> > > > with no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > > >         Fix up PHI in exit nodes.
> > > >
> > > > ---
> > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > > index 20141dbc2e54d49bca00a2e75a6733d181e05ed1..96ca273c24680556f16cdc9e465f490d7fcdb8a4 100644
> > > > --- a/gcc/tree-vect-loop-manip.cc
> > > > +++ b/gcc/tree-vect-loop-manip.cc
> > > > @@ -1755,6 +1755,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > >             new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > > >                                              loop_latch_edge (loop));
> > > >
> > > > +          /* Some loops start at a non-zero index, which then requires
> > > > +             the redirection to return i instead of i + 1.  Edge
> > > > +             redirection already deals with this, but we need to make
> > > > +             sure we don't override the value it puts in for these
> > > > +             cases.  */
> > > > +          imm_use_iterator imm_iter;
> > > > +          use_operand_p use_p;
> > > > +          tree from_def = PHI_RESULT (from_phi);
> > > > +          bool abnormal_exit_condition = false;
> > > > +          FOR_EACH_IMM_USE_FAST (use_p, imm_iter, from_def)
> > > > +            if (gimple_bb (USE_STMT (use_p)) == main_loop_exit_block)
> > > > +              {
> > > > +                abnormal_exit_condition = true;
> > > > +                break;
> > > > +              }
> > > > +
> > >
> > > I'm a bit confused - isn't this exactly what 'peeled_iters'
> > > (aka LOOP_VINFO_EARLY_BREAKS_VECT_PEELED) is about? That is,
> > > this loop is supposed to handle the main exit (and creating
> > > all required merge PHIs and eventually missing LC PHIs), so
> > > your patch suggests we put in the "wrong" result, but you
> > > leave the old 'peeled_iters' code around, even using the
> > > original 'caching' (but then for the wrong value?).
> > >
> >
> > No, peeled_iters means we want the value at the start of the vector
> > iteration.  The issue here is that we want the value at the end of the
> > vector iteration - <offset>.
> >
> > So they aren't the same.  In a normal loop this adjustment is calculated
> > by vect_update_ivs_after_vectorizer through niters; because it has the
> > right PHI node, it doesn't use the incorrect final value of i to adjust
> > the count.
> >
> > > At the same time the 2nd loop filling the alternate exit cases
> > > is always correct?
> >
> > Yes, because alternate exits always restart the loop from the start, so
> > the reduction for it takes you to before any adjustment at all is done
> > for i.
>
> So I believe the logic
>
>           /* When the vector loop is peeled then we need to use the
>              value at start of the loop, otherwise the main loop exit
>              should use the final iter value.  */
>           tree new_arg;
>           if (peeled_iters)
>             new_arg = gimple_phi_result (from_phi);
>           else
>             new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
>                                              loop_latch_edge (loop));
>
> for !peeled_iters is correct. In this place we are looking for the
> initial value to use for the epilog loop - but in this case this
> value is different from the actual live value computed by the testcase.
> Note there is in general no way to associate the LC PHI [def] with
> the IV in the loop header if it is neither the PHI result nor the
> latch value - as it is in this case. Instead the LC PHI is for
> an altogether different "IV" (but a related, offset one).
>
> And I see '5' being used as the continuation value via
> vect_update_ivs_after_vectorizer, so that seems correct.
>
> So, I can't really see what is wrong here? The vectorized code
> also looks OK. It should be possible to set up a C testcase for
> this, like
>
As discussed on IRC, I'll move the fix somewhere else then, but it might
end up being more disruptive.
Thanks,
Tamar
>   int i = 1, j;
>   do
>     {
>       if (...) abort ();
>       j = i + 1;
>     }
>   while (...);
>   ... use i ...
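>
> For concreteness, a compilable variant of the above sketch (the
> early-break condition, the array argument, and the bound are made up
> for illustration):
>
>   extern void abort (void);
>
>   int
>   f (const int *a, int n)
>   {
>     int i = 1, j;
>     do
>       {
>         if (a[i - 1] < 0)   /* early-break exit */
>           abort ();
>         j = i + 1;
>         if (j > n)          /* main exit */
>           break;
>         i = j;
>       }
>     while (1);
>     return i;               /* the LC use is the pre-increment IV value */
>   }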
>
> Richard.
>
> > >
> > > Does the issue show with the Fortran testcase on current trunk?
> > >
> >
> > Yes, the codegen on trunk today is wrong, but the main exit would never
> > be reached because of the increased VF.  So the test "passes" because it
> > takes the early exit.
> >
> > Thanks,
> > Tamar
> >
> > > Thanks,
> > > Richard.
> > >
> > > >           /* Check if we've already created a new phi node during
> > > >              edge redirection and re-use it if so.  Otherwise create
> > > >              a LC PHI node to feed the merge PHI.  */
> > > > @@ -1769,6 +1785,11 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > >               new_arg = *res;
> > > >             else
> > > >               {
> > > > +               /* If we already found an LC PHI for the current PHI
> > > > +                  then just re-use it.  */
> > > > +               if (abnormal_exit_condition)
> > > > +                 new_arg = from_def;
> > > > +
> > > >                 /* Create the LC PHI node for the exit.  */
> > > >                 tree new_def = copy_ssa_name (new_arg);
> > > >                 gphi *lc_phi
> > > > @@ -1779,7 +1800,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > >
> > > >           /* Create the PHI node in the merge block merging the
> > > >              main and early exit values.  */
> > > > -         tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > > +         tree new_res = copy_ssa_name (from_def);
> > > >           gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > > >           edge main_e = single_succ_edge (main_loop_exit_block);
> > > >           SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, main_e, new_arg);
> > > >
> > > >
> > > > --
> > > >
> > >
> > > --
> > > Richard Biener <[email protected]>
> > > SUSE Software Solutions Germany GmbH,
> > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> >
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)