> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Wednesday, December 6, 2023 9:15 AM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com
> Subject: RE: [PATCH 13/21]middle-end: Update loop form analysis to support 
> early
> break
> 
> On Wed, 6 Dec 2023, Tamar Christina wrote:
> 
> > > -----Original Message-----
> > > From: Richard Biener <rguent...@suse.de>
> > > Sent: Wednesday, December 6, 2023 8:18 AM
> > > To: Tamar Christina <tamar.christ...@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com
> > > Subject: Re: [PATCH 13/21]middle-end: Update loop form analysis to support
> early
> > > break
> > >
> > > On Mon, 6 Nov 2023, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This sets LOOP_VINFO_EARLY_BREAKS and does some misc changes so the
> other
> > > > patches are self contained.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * tree-vect-loop.cc (vect_analyze_loop_form): Analyse all exits.
> > > >         (vect_create_loop_vinfo): Set LOOP_VINFO_EARLY_BREAKS.
> > > >         (vect_transform_loop): Use it.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > > > index
> > >
> 51a054c5b035ac80dfbbf3b5ba2f6da82fda91f6..f9483eff6e9606e835906fb991
> > > f07cd6052491d0 100644
> > > > --- a/gcc/tree-vect-loop.cc
> > > > +++ b/gcc/tree-vect-loop.cc
> > > > @@ -1700,12 +1700,12 @@ vect_compute_single_scalar_iteration_cost
> > > (loop_vec_info loop_vinfo)
> > > >    loop_vinfo->scalar_costs->finish_cost (nullptr);
> > > >  }
> > > >
> > > > -
> > > >  /* Function vect_analyze_loop_form.
> > > >
> > > >     Verify that certain CFG restrictions hold, including:
> > > >     - the loop has a pre-header
> > > > -   - the loop has a single entry and exit
> > > > +   - the loop has a single entry
> > > > +   - nested loops can have only a single exit.
> > > >     - the loop exit condition is simple enough
> > > >     - the number of iterations can be analyzed, i.e, a countable loop.  
> > > > The
> > > >       niter could be analyzed under some assumptions.  */
> > > > @@ -1841,10 +1841,14 @@ vect_analyze_loop_form (class loop *loop,
> > > vect_loop_form_info *info)
> > > >                                    "not vectorized: latch block not 
> > > > empty.\n");
> > > >
> > > >    /* Make sure the exit is not abnormal.  */
> > > > -  if (exit_e->flags & EDGE_ABNORMAL)
> > > > -    return opt_result::failure_at (vect_location,
> > > > -                                  "not vectorized:"
> > > > -                                  " abnormal loop exit edge.\n");
> > > > +  auto_vec<edge> exits = get_loop_exit_edges (loop);
> > > > +  for (edge e : exits)
> > >
> > > Seeing this multiple times, this isn't the most efficient way to
> > > iterate over all exits with LOOPS_HAVE_RECORDED_EXITS.
> > >
> > > Note to myself: fix (add to) the API.
> > >
> > > > +    {
> > > > +      if (e->flags & EDGE_ABNORMAL)
> > > > +       return opt_result::failure_at (vect_location,
> > > > +                                      "not vectorized:"
> > > > +                                      " abnormal loop exit edge.\n");
> > > > +    }
> > > >
> > > >    info->conds
> > > >      = vect_get_loop_niters (loop, exit_e, &info->assumptions,
> > > > @@ -1920,6 +1924,10 @@ vect_create_loop_vinfo (class loop *loop,
> > > vec_info_shared *shared,
> > > >
> > > >    LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
> > > >
> > > > +  /* Check to see if we're vectorizing multiple exits.  */
> > > > +  LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > > > +    = !LOOP_VINFO_LOOP_CONDS (loop_vinfo).is_empty ();
> > > > +
> > >
> > > Seeing this,
> s/LOOP_VINFO_LOOP_CONDS/LOOP_VINFO_LOOP_EXIT_CONDS/g
> > > might be good, if we in future avoid if-conversion in a separate
> > > pass we will have other CONDs as well.
> > >
> > > >    if (info->inner_loop_cond)
> > > >      {
> > > >        stmt_vec_info inner_loop_cond_info
> > > > @@ -11577,7 +11585,7 @@ vect_transform_loop (loop_vec_info
> loop_vinfo,
> > > gimple *loop_vectorized_call)
> > > >    /* Make sure there exists a single-predecessor exit bb.  Do this 
> > > > before
> > > >       versioning.   */
> > > >    edge e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > > -  if (! single_pred_p (e->dest))
> > > > +  if (! single_pred_p (e->dest) && !LOOP_VINFO_EARLY_BREAKS
> (loop_vinfo))
> > > >      {
> > > >        split_loop_exit_edge (e, true);
> > >
> > > Note this splitting is done to fulfil versioning constraints on CFG
> > > update.  Do you have test coverage with alias versioning and early
> > > breaks?
> >
> > No, only non-alias versioning.  I don't believe we can alias in the current
> > implementation because it's restricted to statically known objects with
> > a fixed size.
> 
> Hm, if side-effects are all correctly in place do we still have that
> restriction?
> 
> int x;
> void foo (int *a, int *b)
> {
>   int local_x = x;
>   for (int i = 0; i < 1024; ++i)
>     {
>       if (i % local_x == 13)
>         break;
>       a[i] = 2 * b[i];
>     }
> }
> 
> the early exit isn't SCEV analyzable but doesn't depend on any
> memory and all side-effects are after the exit already.  But
> vectorizing would require alias versioning.

Oh, you're right.  A slightly simpler testcase:

int x;                                                                          
                                                                                
                                                                                
                             void foo (int *a, int *b)
{
  int local_x = x;
  for (int i = 0; i < 1024; ++i)
    {
      if (i + local_x == 13)
        break;
      a[i] = 2 * b[i];
    }
}

Vectorizes and has added the correct alias check.  

uwl-alias.c:9:19: missed:   versioning for alias required: can't determine 
dependence between *_4 and *_6                                                  
                                                                                
                                  consider run-time aliasing test between *_4 
and *_6
merged alias checks:
  reference:      *_4 vs. *_6
  segment length: 12
  access size:    4
  alignment:      4
  flags:          WAR
uwl-alias.c:5:21: note:   improved number of alias checks from 1 to 1
uwl-alias.c:5:21: note:  created 1 versioning for alias checks.                 
                                                                                
                                                                                
                             uwl-alias.c:5:21: note:  trying to apply 
versioning to outer loop 0

So I'll make it a runtime test and added it to the testsuite.

Cheers,
Tamar

> 
> Richard.

Reply via email to