RE: [PATCH 7/21]middle-end: update IV update code to support early breaks and arbitrary exits

Tamar Christina Fri, 17 Nov 2023 02:41:15 -0800

> > > > > Yes, but that only works for the inductions marked so.  We'd
> > > > > need to mark the others as well, but only for the early exits.
> > > > >
> > > > > > > although I don't understand why we use the scalar count,  I
> > > > > > > suppose the reasoning is that we don't really want to keep it
> > > > > > > around, and referencing it forces it to be kept?
> > > > >
> > > > > > Referencing it will cause the scalar compute to be retained, but
> > > > > > since we do not adjust the scalar compute during vectorization
> > > > > > (but expect it to be dead) the scalar compute will compute the
> > > > > > wrong thing (as shown by the reduction example - I suspect
> > > > > > inductions will suffer from the same problem).
> > > > >
> > > > > > At the moment it just does `init + (final - init) * vf` which is 
> > > > > > correct no?
> > > > >
> > > > > > The issue is that 'final' is not computed correctly in the
> > > > > > vectorized loop.  This formula might work for affine evolutions
> > > > > > of course.
> > > > >
> > > > > Extracting the correct value from the vectorized induction would
> > > > > be the preferred solution.
> > > >
> > > > Ok, so I should be able to just mark IVs as live during
> > > > process_use if there are multiple exits right? Since it's just
> > > > gonna be unused on the main exit since we use niters?
> > > >
> > > > Because since it's the PHI inside the loop that needs to be marked
> > > > live I can't just do it for a specific exits no?
> > > >
> > > > If I create a copy of the PHI node during peeling for use in early
> > > > exits and mark it live it won't work no?
> > >
> > > I guess I wouldn't actually mark it STMT_VINFO_LIVE_P but somehow
> > > arrange vectorizable_live_operation to be called, possibly adding a
> > > edge argument to that as well.
> > >
> > > Maybe the thing to do for the moment is to reject vectorization with
> > > early breaks if there's any (non-STMT_VINFO_LIVE_P?) induction or
> > > reduction besides the main counting IV one you can already special-case?
> >
> > Ok so I did a quick hack with:
> >
> >       if (!virtual_operand_p (PHI_RESULT (phi))
> >       && !STMT_VINFO_LIVE_P (phi_info))
> >     {
> >       use_operand_p use_p;
> >       imm_use_iterator imm_iter;
> >       bool non_exit_use = false;
> >       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, PHI_RESULT (phi))
> >         if (!flow_bb_inside_loop_p (loop, gimple_bb (USE_STMT (use_p))))
> >           for (auto exit : get_loop_exit_edges (loop))
> >             {
> >               if (exit == LOOP_VINFO_IV_EXIT (loop_vinfo))
> >                 continue;
> >
> >               if (gimple_bb (USE_STMT (use_p)) != exit->dest)
> >                 {
> >                   non_exit_use = true;
> >                   goto fail;
> >                 }
> >             }
> > fail:
> >       if (non_exit_use)
> >         return false;
> >     }
> >
> > And it does seem to still allow all the cases I want.  I've placed
> > this in vect_can_advance_ivs_p.
> >
> > Does this cover what you meant?
> >
> 
> Ok, I've rewritten this in a nicer form, but doesn't this mean we now block
> any loop where the index is not live?
> i.e. we block such simple loops like
> 
> #ifndef N
> #define N 800
> #endif
> unsigned vect_a[N];
> 
> unsigned test4(unsigned x)
> {
>  unsigned ret = 0;
>  for (int i = 0; i < N; i++)
>  {
>    if (vect_a[i]*2 != x)
>      break;
>    vect_a[i] = x;
>  }
>  return ret;
> }
> 
> because it does a simple `break`.  If I force it to be live it works, but
> then I need to differentiate between the counter and the IV.
> 
> # i_15 = PHI <i_12(6), 0(2)>
> # ivtmp_7 = PHI <ivtmp_14(6), 803(2)>
> 
> It seems like if we don't want to keep i_15 around (at the moment it will be
> kept because of its usage in the exit block, so it won't be DCEd) then we
> need to mark it live early during analysis.
> 
> Most likely if we do this I don't need to care about the "inverted" workflow
> here at all. What do you think?
> 
> Yes that doesn't work for SLP, but I don't think I can get SLP working in the
> remaining time anyway..
> 
> I'll fix reduction and multiple exit live values in the mean time.
>


Ok, so I currently have the following solution.  Let me know if you agree with
it and I'll polish it up today and tomorrow and respin things.

1. During vect_update_ivs_after_vectorizer we no longer touch any PHIs aside
   from just updating IVtemps with the expected remaining iteration count.
2. During vect_transform_loop, after vectorizing any induction or reduction, I
   call vectorizable_live_operation for any PHI node that still has any uses
   in the early exit merge block.
3. vectorizable_live_operation is taught to materialize the same PHI in
   multiple exits.
4. vectorizable_reduction, or maybe vect_create_epilog_for_reduction, needs to
   be modified to materialize the previous iteration value for early exits.
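
As a rough illustration of item 4, here is a hand-written scalar C model (a
sketch only, not vectorizer code; the function names scalar_sum and
vector_sum_model and the loop shape are assumptions made up for this example).
It assumes a reduction that is updated before the break test.  On an early
exit the accumulator has already absorbed the whole current vector iteration,
so the exit has to use the accumulator from the previous vector iteration and
let the scalar epilogue redo the last VF elements.

```c
#include <assert.h>
#include <string.h>

#define N 800
#define VF 4

/* Scalar reference: add each element, then stop once we saw `stop`.  */
unsigned scalar_sum (const unsigned *a, unsigned stop)
{
  unsigned sum = 0;
  for (int i = 0; i < N; i++)
    {
      sum += a[i];
      if (a[i] == stop)
        break;
    }
  return sum;
}

/* Model of the vectorized loop with VF lanes (hypothetical sketch).  */
unsigned vector_sum_model (const unsigned *a, unsigned stop)
{
  unsigned acc[VF] = { 0, 0, 0, 0 };
  int early = 0;
  int i;

  for (i = 0; i < N; i += VF)
    {
      unsigned prev[VF];
      int any_break = 0;

      /* Remember the accumulator from before this vector iteration.  */
      memcpy (prev, acc, sizeof prev);

      for (int l = 0; l < VF; l++)
        {
          acc[l] += a[i + l];
          if (a[i + l] == stop)
            any_break = 1;
        }

      if (any_break)
        {
          /* Early exit: the live-out value is the previous accumulator,
             because the scalar epilogue re-executes these VF elements.  */
          memcpy (acc, prev, sizeof acc);
          early = 1;
          break;
        }
    }

  /* Reduce the vector accumulator to a scalar in the exit.  */
  unsigned sum = 0;
  for (int l = 0; l < VF; l++)
    sum += acc[l];

  /* Scalar epilogue: at most VF iterations after an early exit.  */
  if (early)
    for (int k = 0; k < VF; k++)
      {
        sum += a[i + k];
        if (a[i + k] == stop)
          break;
      }

  return sum;
}
```

Using the updated accumulator on the early exit would count the elements of
the last vector iteration twice once the epilogue runs.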

This seems to work and now produces the following for the simple loop above:

.L2:
        str     q27, [x1, x3]
        str     q29, [x2, x1]
        add     x1, x1, 16
        cmp     x1, 3200
        beq     .L11
.L4:
        ldr     q31, [x2, x1]
        mov     v28.16b, v30.16b
        add     v30.4s, v30.4s, v26.4s
        shl     v31.4s, v31.4s, 1
        add     v27.4s, v28.4s, v29.4s
        cmeq    v31.4s, v31.4s, v29.4s
        not     v31.16b, v31.16b
        umaxp   v31.4s, v31.4s, v31.4s
        fmov    x4, d31
        cbz     x4, .L2
        fmov    w1, s28
        mov     w6, 4
.L3:

So now the scalar index is no longer kept, and the exit extracts the value
from the vector IV:

fmov    w1, s28
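
For readers following the assembly, the same idea can be written as a scalar C
model.  This is a hand-written sketch under the assumption that N is a
multiple of VF; the names scalar_loop and vector_model are made up here, and
the comments only point at the instructions they loosely correspond to.  The
vector IV is copied before it is stepped, and on an early exit lane 0 of that
copy becomes the restart index for a scalar epilogue of at most VF iterations.

```c
#include <assert.h>
#include <string.h>

#define N 800
#define VF 4

/* Scalar reference: same shape as test4, but returns the final index
   so that both versions can be compared.  */
int scalar_loop (unsigned *a, unsigned x)
{
  int i;
  for (i = 0; i < N; i++)
    {
      if (a[i] * 2 != x)
        break;
      a[i] = x;
    }
  return i;
}

/* Model of the vectorized loop (hypothetical sketch, not compiler output).
   vec_iv models the vector induction {i, i+1, i+2, i+3}.  */
int vector_model (unsigned *a, unsigned x)
{
  int vec_iv[VF];
  for (int l = 0; l < VF; l++)
    vec_iv[l] = l;

  int i = N;          /* Main exit: all vector iterations were done.  */
  int remaining = 0;  /* Iterations left for the scalar epilogue.  */

  for (int done = 0; done < N; done += VF)
    {
      int prev_iv[VF];
      int any_break = 0;

      memcpy (prev_iv, vec_iv, sizeof prev_iv);  /* cf. mov v28, v30  */
      for (int l = 0; l < VF; l++)
        vec_iv[l] += VF;                         /* cf. add v30, v30, v26  */

      for (int l = 0; l < VF; l++)
        if (a[prev_iv[l]] * 2 != x)
          any_break = 1;

      if (any_break)
        {
          i = prev_iv[0];                        /* cf. fmov w1, s28  */
          remaining = VF;                        /* cf. mov w6, 4  */
          break;
        }

      /* No lane breaks: perform the stores for the whole vector.  */
      for (int l = 0; l < VF; l++)
        a[prev_iv[l]] = x;
    }

  /* Scalar epilogue: redo the iterations of the vector that broke.  */
  for (; remaining > 0; remaining--, i++)
    {
      if (a[i] * 2 != x)
        break;
      a[i] = x;
    }

  return i;
}
```

No scalar index is carried through the vector loop in this model; the restart
value comes only from the saved vector IV.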

Does this work as you expected?

Thanks,
Tamar

> Thanks,
> Tamar
> > Thanks,
> > Tamar
> >
> > >
> > > Richard.
> > >
> > > > Tamar
> > > > >
> > > > > > Also you missed the question below about how to avoid the
> > > > > > creation of the block, You ok with changing that?
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > > Or for now disable early-break for inductions that are not
> > > > > > > the main exit control IV (in vect_can_advance_ivs_p)?
> > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > It seems your change handles different kinds of
> > > > > > > > > > > inductions
> > > > > differently.
> > > > > > > > > > > Specifically
> > > > > > > > > > >
> > > > > > > > > > >       bool ivtemp = gimple_cond_lhs (cond) == iv_var;
> > > > > > > > > > >       if (restart_loop && ivtemp)
> > > > > > > > > > >         {
> > > > > > > > > > >           type = TREE_TYPE (gimple_phi_result (phi));
> > > > > > > > > > >           ni = build_int_cst (type, vf);
> > > > > > > > > > >           if (inversed_iv)
> > > > > > > > > > >             ni = fold_build2 (MINUS_EXPR, type, ni,
> > > > > > > > > > >                               fold_convert (type, 
> > > > > > > > > > > step_expr));
> > > > > > > > > > >         }
> > > > > > > > > > >
> > > > > > > > > > > it looks like for the exit test IV we use either 'VF' or 
> > > > > > > > > > > 'VF - step'
> > > > > > > > > > > as the new value.  That seems to be very odd special
> > > > > > > > > > > casing for unknown reasons.  And while you adjust
> > > > > > > > > > > vec_step_op_add, you don't adjust
> > > > > > > > > > > vect_peel_nonlinear_iv_init (maybe not supported -
> > > > > > > > > > > better assert
> > > > > > > > > here).
> > > > > > > > > >
> > > > > > > > > > The VF case is for a normal "non-inverted" loop, where
> > > > > > > > > > if you take an early exit you know that you have to do
> > > > > > > > > > at most VF
> > > iterations.
> > > > > > > > > > The VF
> > > > > > > > > > - step is to account for the inverted loop control
> > > > > > > > > > flow where you exit after adjusting the IV already by + 
> > > > > > > > > > step.
> > > > > > > > >
> > > > > > > > > But doesn't that assume the IV counts from niter to zero?
> > > > > > > > > I don't see this special case is actually necessary, no?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I needed it because otherwise the scalar loop iterates one
> > > > > > > > iteration too little So I got a miscompile with the
> > > > > > > > inverter loop stuff.  I'll look at it again perhaps It can be 
> > > > > > > > solved
> differently.
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Peeling doesn't matter here, since you know you were
> > > > > > > > > > able to do a vector iteration so it's safe to do VF 
> > > > > > > > > > iterations.
> > > > > > > > > > So having peeled doesn't affect the remaining iters count.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Also the vec_step_op_add case will keep the original
> > > > > > > > > > > scalar IV live even when it is a vectorized induction.
> > > > > > > > > > > The code recomputing the value from scratch avoids this.
> > > > > > > > > > >
> > > > > > > > > > >       /* For non-main exit create an intermediat
> > > > > > > > > > > edge to get any updated
> > > > > > > iv
> > > > > > > > > > >          calculations.  */
> > > > > > > > > > >       if (needs_interm_block
> > > > > > > > > > >           && !iv_block
> > > > > > > > > > >           && (!gimple_seq_empty_p (stmts) ||
> > > > > > > > > > > !gimple_seq_empty_p
> > > > > > > > > > > (new_stmts)))
> > > > > > > > > > >         {
> > > > > > > > > > >           iv_block = split_edge (update_e);
> > > > > > > > > > >           update_e = single_succ_edge (update_e->dest);
> > > > > > > > > > >           last_gsi = gsi_last_bb (iv_block);
> > > > > > > > > > >         }
> > > > > > > > > > >
> > > > > > > > > > > this is also odd, can we adjust the API instead?  I
> > > > > > > > > > > suppose this is because your computation uses the
> > > > > > > > > > > original loop IV, if you based the computation off
> > > > > > > > > > > the initial value only this might not be
> > > > > > > necessary?
> > > > > > > > > >
> > > > > > > > > > No, on the main exit the code updates the value in the
> > > > > > > > > > loop header and puts the Calculation in the merge block.
> > > > > > > > > > This works because it only needs to consume PHI nodes
> > > > > > > > > > in the merge block and things like niters are
> > > > > > > > > adjusted in the guard block.
> > > > > > > > > >
> > > > > > > > > > For an early exit, we don't have a guard block, only
> > > > > > > > > > the merge
> > > block.
> > > > > > > > > > We have to update the PHI nodes in that block,  but
> > > > > > > > > > can't do so since you can't produce a value and
> > > > > > > > > > consume it in a PHI node in the same
> > > > > > > BB.
> > > > > > > > > > So we need to create the block to put the values in
> > > > > > > > > > for use in the merge block.  Because there's no "guard"
> > > > > > > > > > block for early
> > > exits.
> > > > > > > > >
> > > > > > > > > ?  then compute niters in that block as well.
> > > > > > > >
> > > > > > > > We can't since it'll not be reachable through the right edge.
> > > > > > > > What we can do if you want is slightly change peeling, we
> > > > > > > > currently peel
> > > > > as:
> > > > > > > >
> > > > > > > >   \        \             /
> > > > > > > >   E1     E2        Normal exit
> > > > > > > >     \       |          |
> > > > > > > >        \    |          Guard
> > > > > > > >           \ |          |
> > > > > > > >          Merge block
> > > > > > > >                   |
> > > > > > > >              Pre Header
> > > > > > > >
> > > > > > > > If we instead peel as:
> > > > > > > >
> > > > > > > >
> > > > > > > >   \        \             /
> > > > > > > >   E1     E2        Normal exit
> > > > > > > >     \       |          |
> > > > > > > >        Exit join   Guard
> > > > > > > >           \ |          |
> > > > > > > >          Merge block
> > > > > > > >                   |
> > > > > > > >              Pre Header
> > > > > > > >
> > > > > > > > We can use the exit join block.  This would also mean
> > > > > > > > vect_update_ivs_after_vectorizer Doesn't need to iterate
> > > > > > > > over all exits and only really needs to adjust the phi
> > > > > > > > nodes Coming out of the exit join
> > > > > > > and guard block.
> > > > > > > >
> > > > > > > > Does this work for you?
> > > > >
> > > > > Yeah, I think that would work.  But I'd like to sort out the
> > > > > correctness details of the IV update itself before sorting out
> > > > > this code
> > > placement detail.
> > > > >
> > > > > Richard.
> > > > >
> > > > > > > > Thanks,
> > > > > > > > Tamar
> > > > > > > > >
> > > > > > > > > > The API can be adjusted by always creating the empty
> > > > > > > > > > block either during
> > > > > > > > > peeling.
> > > > > > > > > > That would prevent us from having to do anything special 
> > > > > > > > > > here.
> > > > > > > > > > Would that work better?  Or I can do it in the loop
> > > > > > > > > > that iterates over the exits to before the call to
> > > > > > > > > > vect_update_ivs_after_vectorizer, which I think
> > > > > > > > > might be more consistent.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > That said, I wonder why we cannot simply pass in an
> > > > > > > > > > > adjusted niter which would be niters_vector_mult_vf
> > > > > > > > > > > - vf and be done with
> > > > > that?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > We can ofcourse not have this and recompute it from
> > > > > > > > > > niters itself, however this does affect the epilog code 
> > > > > > > > > > layout.
> > > > > > > > > > Particularly knowing the static number if iterations
> > > > > > > > > > left causes it to usually unroll the loop and share
> > > > > > > > > > some of the computations.  i.e. the scalar code is
> > > > > > > > > > often more
> > > > > > > > > efficient.
> > > > > > > > > >
> > > > > > > > > > The computation would be niters_vector_mult_vf -
> > > > > > > > > > iters_done * vf, since the value put Here is the
> > > > > > > > > > remaining iteration
> > > count.
> > > > > > > > > > It's static for early
> > > > > > > > > exits.
> > > > > > > > >
> > > > > > > > > Well, it might be "static" in that it doesn't really
> > > > > > > > > matter what you use for the epilog main IV initial value
> > > > > > > > > as long as you are sure you're not going to take that
> > > > > > > > > exit as you are sure we're going to take one of the
> > > > > > > > > early exits.  So yeah, the special code is probably OK,
> > > > > > > > > but it needs a better comment and as said the structure
> > > > > > > > > of
> > > > > > > vect_update_ivs_after_vectorizer is a bit hard to follow now.
> > > > > > > > >
> > > > > > > > > As said an important part for optimization is to not
> > > > > > > > > keep the scalar IVs live in the vector loop.
> > > > > > > > >
> > > > > > > > > > But can do whatever you prefer here.  Let me know what
> > > > > > > > > > you prefer for the
> > > > > > > > > above.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Tamar
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Richard.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Tamar
> > > > > > > > > > > > >
> > > > > > > > > > > > > > It has to do this since you have to perform
> > > > > > > > > > > > > > the side effects for the non-matching elements 
> > > > > > > > > > > > > > still.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > Tamar
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +             if (STMT_VINFO_LIVE_P (phi_info))
> > > > > > > > > > > > > > > > +               continue;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +             /* For early break the final loop 
> > > > > > > > > > > > > > > > IV is:
> > > > > > > > > > > > > > > > +                init + (final - init) * vf 
> > > > > > > > > > > > > > > > which takes
> > > > > > > > > > > > > > > > +into account
> > > > > > > peeling
> > > > > > > > > > > > > > > > +                values and non-single steps.  
> > > > > > > > > > > > > > > > The
> > main
> > > > > > > > > > > > > > > > +exit
> > > > > > > can
> > > > > > > > > > > > > > > > +use
> > > > > > > > > > > niters
> > > > > > > > > > > > > > > > +                since if you exit from the 
> > > > > > > > > > > > > > > > main exit
> > > > > > > > > > > > > > > > +you've
> > > > > > > done
> > > > > > > > > > > > > > > > +all
> > > > > > > > > > > vector
> > > > > > > > > > > > > > > > +                iterations.  For an early exit 
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > +don't know
> > > > > > > when
> > > > > > > > > > > > > > > > +we
> > > > > > > > > > > exit
> > > > > > > > > > > > > > > > +so
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > +                must re-calculate this on the 
> > > > > > > > > > > > > > > > exit.  */
> > > > > > > > > > > > > > > > +             tree start_expr = 
> > > > > > > > > > > > > > > > gimple_phi_result (phi);
> > > > > > > > > > > > > > > > +             off = fold_build2 (MINUS_EXPR, 
> > > > > > > > > > > > > > > > stype,
> > > > > > > > > > > > > > > > +                                fold_convert 
> > > > > > > > > > > > > > > > (stype,
> > > > > > > start_expr),
> > > > > > > > > > > > > > > > +                                fold_convert 
> > > > > > > > > > > > > > > > (stype,
> > > > > > > init_expr));
> > > > > > > > > > > > > > > > +             /* Now adjust for VF to get the
> > > > > > > > > > > > > > > > +final
> > iteration value.
> > > > > > > */
> > > > > > > > > > > > > > > > +             off = fold_build2 (MULT_EXPR, 
> > > > > > > > > > > > > > > > stype, off,
> > > > > > > > > > > > > > > > +                                build_int_cst 
> > > > > > > > > > > > > > > > (stype,
> > vf));
> > > > > > > > > > > > > > > > +           }
> > > > > > > > > > > > > > > > +         else
> > > > > > > > > > > > > > > > +           off = fold_build2 (MULT_EXPR, stype,
> > > > > > > > > > > > > > > > +                              fold_convert 
> > > > > > > > > > > > > > > > (stype,
> > niters),
> > > > > > > step_expr);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >           if (POINTER_TYPE_P (type))
> > > > > > > > > > > > > > > >             ni = fold_build_pointer_plus 
> > > > > > > > > > > > > > > > (init_expr, off);
> > > > > > > > > > > > > > > >           else
> > > > > > > > > > > > > > > > @@ -2238,6 +2286,8 @@
> > > > > > > > > > > > > > > > vect_update_ivs_after_vectorizer
> > > > > > > > > > > > > > > > (loop_vec_info
> > > > > > > > > > > > > > > loop_vinfo,
> > > > > > > > > > > > > > > >        /* Don't bother call 
> > > > > > > > > > > > > > > > vect_peel_nonlinear_iv_init.
> */
> > > > > > > > > > > > > > > >        else if (induction_type == 
> > > > > > > > > > > > > > > > vect_step_op_neg)
> > > > > > > > > > > > > > > >         ni = init_expr;
> > > > > > > > > > > > > > > > +      else if (restart_loop)
> > > > > > > > > > > > > > > > +       continue;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This looks all a bit complicated - why
> > > > > > > > > > > > > > > wouldn't we simply always use the PHI result
> > > > > > > > > > > > > > > when
> > 'restart_loop'?
> > > > > > > > > > > > > > > Isn't that the correct old start value in
> > > > > > > > > > > > > all cases?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >        else
> > > > > > > > > > > > > > > >         ni = vect_peel_nonlinear_iv_init
> > > > > > > > > > > > > > > > (&stmts,
> > init_expr,
> > > > > > > > > > > > > > > >                                           
> > > > > > > > > > > > > > > > niters,
> > step_expr,
> > > > > @@ -
> > > > > > > > > 2245,9 +2295,20 @@
> > > > > > > > > > > > > > > > vect_update_ivs_after_vectorizer
> > > > > > > > > > > > > > > (loop_vec_info
> > > > > > > > > > > > > > > > loop_vinfo,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >        var = create_tmp_var (type, "tmp");
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > -      last_gsi = gsi_last_bb (exit_bb);
> > > > > > > > > > > > > > > >        gimple_seq new_stmts = NULL;
> > > > > > > > > > > > > > > >        ni_name = force_gimple_operand (ni,
> > > > > > > > > > > > > > > > &new_stmts, false, var);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +      /* For non-main exit create an
> > > > > > > > > > > > > > > > + intermediat edge to get any
> > > > > > > > > > > updated iv
> > > > > > > > > > > > > > > > +        calculations.  */
> > > > > > > > > > > > > > > > +      if (needs_interm_block
> > > > > > > > > > > > > > > > +         && !iv_block
> > > > > > > > > > > > > > > > +         && (!gimple_seq_empty_p (stmts) ||
> > > > > > > > > > > > > > > > +!gimple_seq_empty_p
> > > > > > > > > > > > > > > (new_stmts)))
> > > > > > > > > > > > > > > > +       {
> > > > > > > > > > > > > > > > +         iv_block = split_edge (update_e);
> > > > > > > > > > > > > > > > +         update_e = single_succ_edge (update_e-
> > >dest);
> > > > > > > > > > > > > > > > +         last_gsi = gsi_last_bb (iv_block);
> > > > > > > > > > > > > > > > +       }
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > >        /* Exit_bb shouldn't be empty.  */
> > > > > > > > > > > > > > > >        if (!gsi_end_p (last_gsi))
> > > > > > > > > > > > > > > >         {
> > > > > > > > > > > > > > > > @@ -3342,8 +3403,26 @@ vect_do_peeling
> > > > > > > > > > > > > > > > (loop_vec_info loop_vinfo, tree
> > > > > > > > > > > > > > > niters, tree nitersm1,
> > > > > > > > > > > > > > > >          niters_vector_mult_vf steps.  */
> > > > > > > > > > > > > > > >        gcc_checking_assert
> > > > > > > > > > > > > > > > (vect_can_advance_ivs_p
> > > > > > > (loop_vinfo));
> > > > > > > > > > > > > > > >        update_e = skip_vector ? e :
> > > > > > > > > > > > > > > > loop_preheader_edge
> > > > > (epilog);
> > > > > > > > > > > > > > > > -      vect_update_ivs_after_vectorizer 
> > > > > > > > > > > > > > > > (loop_vinfo,
> > > > > > > > > > > niters_vector_mult_vf,
> > > > > > > > > > > > > > > > -                                       
> > > > > > > > > > > > > > > > update_e);
> > > > > > > > > > > > > > > > +      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > > > > > > > > > > > > > > > +       update_e = single_succ_edge (e->dest);
> > > > > > > > > > > > > > > > +      bool inversed_iv
> > > > > > > > > > > > > > > > +       = !vect_is_loop_exit_latch_pred
> > > > > > > (LOOP_VINFO_IV_EXIT
> > > > > > > > > > > (loop_vinfo),
> > > > > > > > > > > > > > > > +
> > LOOP_VINFO_LOOP
> > > > > > > > > > > (loop_vinfo));
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > You are computing this here and in
> > > > > > > > > vect_update_ivs_after_vectorizer?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +      /* Update the main exit first.  */
> > > > > > > > > > > > > > > > +      vect_update_ivs_after_vectorizer
> > > > > > > > > > > > > > > > + (loop_vinfo, vf,
> > > > > > > > > > > > > niters_vector_mult_vf,
> > > > > > > > > > > > > > > > +                                       
> > > > > > > > > > > > > > > > update_e,
> > > > > > > inversed_iv);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +      /* And then update the early exits.  */
> > > > > > > > > > > > > > > > +      for (auto exit : get_loop_exit_edges 
> > > > > > > > > > > > > > > > (loop))
> > > > > > > > > > > > > > > > +       {
> > > > > > > > > > > > > > > > +         if (exit == LOOP_VINFO_IV_EXIT
> > (loop_vinfo))
> > > > > > > > > > > > > > > > +           continue;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +         vect_update_ivs_after_vectorizer
> > > > > > > > > > > > > > > > +(loop_vinfo, vf,
> > > > > > > > > > > > > > > > +
> > > > > > > niters_vector_mult_vf,
> > > > > > > > > > > > > > > > +                                           
> > > > > > > > > > > > > > > > exit, true);
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ... why does the same not work here?
> > > > > > > > > > > > > > > Wouldn't the proper condition be
> > > > > > > > > > > > > > > !dominated_by_p (CDI_DOMINATORS,
> > > > > > > > > > > > > > > exit->src, LOOP_VINFO_IV_EXIT
> > > > > > > > > > > > > > > (loop_vinfo)->src) or similar?  That is,
> > > > > > > > > > > > > > > whether the exit is at or after the main IV exit?
> > > > > > > > > > > > > > > (consider having
> > > > > > > > > > > > > > > two)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +       }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >        if (skip_epilog)
> > > > > > > > > > > > > > > >         {
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Richard Biener <[email protected]> SUSE Software
> > > > > > > > > > > > > Solutions Germany GmbH, Frankenstrasse 146,
> > > > > > > > > > > > > 90461 Nuernberg, Germany;
> > > > > > > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich;
> > > > > > > > > > > > > (HRB 36809, AG
> > > > > > > > > > > > > Nuernberg)
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Richard Biener <[email protected]> SUSE Software
> > > > > > > > > > > Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > > > > > > > > > Nuernberg, Germany;
> > > > > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich;
> > > > > > > > > > > (HRB 36809, AG
> > > > > > > > > > > Nuernberg)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <[email protected]> SUSE Software
> > > > > > > > > Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > > > > > > > Nuernberg, Germany;
> > > > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB
> > > > > > > > > 36809, AG
> > > > > > > > > Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <[email protected]> SUSE Software Solutions
> > > > > > > Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809,
> > > > > > > AG
> > > > > > > Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <[email protected]> SUSE Software Solutions
> > > > > Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > > > Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <[email protected]>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > Nuernberg)
