Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

Manolis Tsamis Tue, 23 Apr 2024 03:59:02 -0700

On Thu, Oct 19, 2023 at 10:41 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Manolis Tsamis <manolis.tsa...@vrull.eu> writes:
> > This is an extension of what was done in PR106590.
> >
> > Currently if a sequence generated in noce_convert_multiple_sets clobbers the
> > condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
> > (sequences that emit the comparison itself). Since this applies only from 
> > the
> > next iteration it assumes that the sequences generated (in particular seq2)
> > doesn't clobber the condition rtx itself before using it in the 
> > if_then_else,
> > which is only true in specific cases (currently only register/subregister 
> > moves
> > are allowed).
> >
> > This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp 
> > in
> > the current iteration. This makes it possible to include arithmetic 
> > operations
> > in noce_convert_multiple_sets.
> >
> > gcc/ChangeLog:
> >
> >       * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
> >       (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
> >
> > Signed-off-by: Manolis Tsamis <manolis.tsa...@vrull.eu>
> > ---
> >
> > (no changes since v1)
> >
> >  gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------
> >  1 file changed, 19 insertions(+), 30 deletions(-)
>
> Sorry for the slow review.  TBH I was hoping someone else would pick
> it up, since (a) I'm not very familiar with this code, and (b) I don't
> really agree with the way that the current code works.  I'm not sure the
> current dependency checking is safe, so I'm nervous about adding even
> more cases to it.  And it feels like the different ifcvt techniques are
> not now well distinguished, so that they're beginning to overlap and
> compete with one another.  None of that is your fault, of course. :)
>
> > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> > index a0af553b9ff..3273aeca125 100644
> > --- a/gcc/ifcvt.cc
> > +++ b/gcc/ifcvt.cc
> > @@ -3375,20 +3375,6 @@ noce_convert_multiple_sets (struct noce_if_info 
> > *if_info)
> >    return true;
> >  }
> >
> > -/* Helper function for noce_convert_multiple_sets_1.  If store to
> > -   DEST can affect P[0] or P[1], clear P[0].  Called via note_stores.  */
> > -
> > -static void
> > -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0)
> > -{
> > -  rtx *p = (rtx *) p0;
> > -  if (p[0] == NULL_RTX)
> > -    return;
> > -  if (reg_overlap_mentioned_p (dest, p[0])
> > -      || (p[1] && reg_overlap_mentioned_p (dest, p[1])))
> > -    p[0] = NULL_RTX;
> > -}
> > -
> >  /* This goes through all relevant insns of IF_INFO->then_bb and tries to
> >     create conditional moves.  In case a simple move sufficis the insn
> >     should be listed in NEED_NO_CMOV.  The rewired-src cases should be
> > @@ -3552,9 +3538,17 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
> > *if_info,
> >        creating an additional compare for each.  If successful, costing
> >        is easier and this sequence is usually preferred.  */
> >        if (cc_cmp)
> > -     seq2 = try_emit_cmove_seq (if_info, temp, cond,
> > -                                new_val, old_val, need_cmov,
> > -                                &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> > +     {
> > +       seq2 = try_emit_cmove_seq (if_info, temp, cond,
> > +                                  new_val, old_val, need_cmov,
> > +                                  &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> > +
> > +       /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp 
> > is
> > +          clobbered.  We can't safely use the sequence in this case.  */
> > +       if (seq2 && (modified_in_p (cc_cmp, seq2)
> > +           || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2))))
> > +         seq2 = NULL;
>
> modified_in_p only checks the first instruction in seq2, not the whole
> sequence.
>
> I think the unpatched approach is OK in cases where seq2 clobbers
> cc_cmp/rev_cc_cmp in or after the last use, since instructions are
> defined to operate on a read-all/compute/write-all basis.
>
> Soon after the snippet above, the unpatched code has this loop:
>
>       /* The backend might have created a sequence that uses the
>          condition.  Check this.  */
>       rtx_insn *walk = seq2;
>       while (walk)
>         {
>           rtx set = single_set (walk);
>
>           if (!set || !SET_SRC (set))
>
> This condition looks odd.  A SET_SRC is never null.  But more importantly,
> continuing means "assume the best", and I don't think we should assume
> the best for (say) an insn with two parallel sets.
>
> It doesn't look like the series addresses this, but !set seems more
> likely to occur if we extend the function to general operations.
>
>             {
>               walk = NEXT_INSN (walk);
>               continue;
>             }
>
>           rtx src = SET_SRC (set);
>
>           if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE)
>             ; /* We assume that this is the cmove created by the backend that
>                  naturally uses the condition.  Therefore we ignore it.  */
>
> XEXP (set, 1) is src.  But here too, I'm a bit nervous about assuming
> that any if_then_else is safe to ignore, even currently.  It seems
> more dangerous if we extend the function to handle more cases.
>
>           else
>             {
>               if (reg_mentioned_p (XEXP (cond, 0), src)
>                   || reg_mentioned_p (XEXP (cond, 1), src))
>                 {
>                   read_comparison = true;
>                   break;
>                 }
>             }
>
>           walk = NEXT_INSN (walk);
>         }
>
> Maybe a safer way to check whether "insn" is sensitive to the values
> in "val" is to use:
>
>   subrtx_iterator::array_type array;
>   FOR_EACH_SUBRTX (iter, array, val, NONCONST)
>     if (OBJECT_P (*iter) && reg_overlap_mentioned_p (*iter, insn))
>       return true;
>   return false;
>
> which doesn't assume that insn is a single set.
>
> But I'm not sure which cases this code is trying to catch.  Is it trying
> to catch cases where seq2 "spontaneously" uses registers that happen to
> overlap with cond?  If so, then when does that happen?  And if it does
> happen, wouldn't the sequence also have to set the registers first?
>
> If instead the code is trying to detect uses of cond that were fed
> to seq2 outside of cond itself, via try_emit_cmove_seq, then wouldn't
> it be enough to check old_val and new_val?
>
> Anyway, the point I was trying to get to was: we could use the
> FOR_EACH_SUBRTX above to check whether an instruction in seq2
> is sensitive to the value of cc_cmp/rev_cc_cmp.  And we can use
> modified_in_p, as in your patch, to check whether an instruction
> changes the value of cc_cmp/rev_cc_cmp.  We could therefore record
> whether we've seen a modification of cc_cmp/rev_cc_cmp, then reject
> seq2 if we see a subsequent use.
>
Hi Richard,


I have re-implemented this code based on all your suggestions and
resubmitted it as an RFC:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649898.html
The code is a bit more complicated but it handles a lot more cases.

On a side note, I also observed that the original code was sometimes
setting read_comparison = true more conservatively than needed due to
a limitation of reg_mentioned_p.
In the expression reg_mentioned_p (XEXP (cond, 1), src) both XEXP
(cond, 1) and src can be (const_int 1) and reg_mentioned_p will return
true, although it should return false. This is because it has a check

if (reg == in)
  return true;

Which will return true. If we look at reg_overlap_mentioned_p instead,
it has a check

if (CONSTANT_P (in))
  return false;

and I believe we need something similar for reg_mentioned_p.

Thanks,
Manolis

> Thanks,
> Richard
>
> > +     }
> >
> >        /* The backend might have created a sequence that uses the
> >        condition.  Check this.  */
> > @@ -3609,21 +3603,16 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
> > *if_info,
> >         return false;
> >       }
> >
> > -      if (cc_cmp)
> > +      if (cc_cmp && seq == seq1)
> >       {
> > -       /* Check if SEQ can clobber registers mentioned in
> > -          cc_cmp and/or rev_cc_cmp.  If yes, we need to use
> > -          only seq1 from that point on.  */
> > -       rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
> > -       for (walk = seq; walk; walk = NEXT_INSN (walk))
> > +       /* Check if SEQ can clobber registers mentioned in 
> > cc_cmp/rev_cc_cmp.
> > +          If yes, we need to use only seq1 from that point on.
> > +          Only check when we use seq1 since we have already tested seq2.  
> > */
> > +       if (modified_in_p (cc_cmp, seq)
> > +           || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq)))
> >           {
> > -           note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
> > -           if (cc_cmp_pair[0] == NULL_RTX)
> > -             {
> > -               cc_cmp = NULL_RTX;
> > -               rev_cc_cmp = NULL_RTX;
> > -               break;
> > -             }
> > +           cc_cmp = NULL_RTX;
> > +           rev_cc_cmp = NULL_RTX;
> >           }
> >       }

Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

Reply via email to