On Thu, Oct 19, 2023 at 10:41 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Manolis Tsamis <manolis.tsa...@vrull.eu> writes: > > This is an extension of what was done in PR106590. > > > > Currently if a sequence generated in noce_convert_multiple_sets clobbers the > > condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards > > (sequences that emit the comparison itself). Since this applies only from > > the > > next iteration it assumes that the sequences generated (in particular seq2) > > doesn't clobber the condition rtx itself before using it in the > > if_then_else, > > which is only true in specific cases (currently only register/subregister > > moves > > are allowed). > > > > This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp > > in > > the current iteration. This makes it possible to include arithmetic > > operations > > in noce_convert_multiple_sets. > > > > gcc/ChangeLog: > > > > * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead. > > (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp. > > > > Signed-off-by: Manolis Tsamis <manolis.tsa...@vrull.eu> > > --- > > > > (no changes since v1) > > > > gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------ > > 1 file changed, 19 insertions(+), 30 deletions(-) > > Sorry for the slow review. TBH I was hoping someone else would pick > it up, since (a) I'm not very familiar with this code, and (b) I don't > really agree with the way that the current code works. I'm not sure the > current dependency checking is safe, so I'm nervous about adding even > more cases to it. And it feels like the different ifcvt techniques are > not now well distinguished, so that they're beginning to overlap and > compete with one another. None of that is your fault, of course. :) > > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > > index a0af553b9ff..3273aeca125 100644 > > --- a/gcc/ifcvt.cc > > +++ b/gcc/ifcvt.cc > > @@ -3375,20 +3375,6 @@ noce_convert_multiple_sets (struct noce_if_info > > *if_info) > > return true; > > } > > > > -/* Helper function for noce_convert_multiple_sets_1. If store to > > - DEST can affect P[0] or P[1], clear P[0]. Called via note_stores. */ > > - > > -static void > > -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0) > > -{ > > - rtx *p = (rtx *) p0; > > - if (p[0] == NULL_RTX) > > - return; > > - if (reg_overlap_mentioned_p (dest, p[0]) > > - || (p[1] && reg_overlap_mentioned_p (dest, p[1]))) > > - p[0] = NULL_RTX; > > -} > > - > > /* This goes through all relevant insns of IF_INFO->then_bb and tries to > > create conditional moves. In case a simple move sufficis the insn > > should be listed in NEED_NO_CMOV. The rewired-src cases should be > > @@ -3552,9 +3538,17 @@ noce_convert_multiple_sets_1 (struct noce_if_info > > *if_info, > > creating an additional compare for each. If successful, costing > > is easier and this sequence is usually preferred. */ > > if (cc_cmp) > > - seq2 = try_emit_cmove_seq (if_info, temp, cond, > > - new_val, old_val, need_cmov, > > - &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); > > + { > > + seq2 = try_emit_cmove_seq (if_info, temp, cond, > > + new_val, old_val, need_cmov, > > + &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); > > + > > + /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp > > is > > + clobbered. We can't safely use the sequence in this case. */ > > + if (seq2 && (modified_in_p (cc_cmp, seq2) > > + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2)))) > > + seq2 = NULL; > > modified_in_p only checks the first instruction in seq2, not the whole > sequence. > > I think the unpatched approach is OK in cases where seq2 clobbers > cc_cmp/rev_cc_cmp in or after the last use, since instructions are > defined to operate on a read-all/compute/write-all basis. > > Soon after the snippet above, the unpatched code has this loop: > > /* The backend might have created a sequence that uses the > condition. Check this. */ > rtx_insn *walk = seq2; > while (walk) > { > rtx set = single_set (walk); > > if (!set || !SET_SRC (set)) > > This condition looks odd. A SET_SRC is never null. But more importantly, > continuing means "assume the best", and I don't think we should assume > the best for (say) an insn with two parallel sets. > > It doesn't look like the series addresses this, but !set seems more > likely to occur if we extend the function to general operations. > > { > walk = NEXT_INSN (walk); > continue; > } > > rtx src = SET_SRC (set); > > if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE) > ; /* We assume that this is the cmove created by the backend that > naturally uses the condition. Therefore we ignore it. */ > > XEXP (set, 1) is src. But here too, I'm a bit nervous about assuming > that any if_then_else is safe to ignore, even currently. It seems > more dangerous if we extend the function to handle more cases. > > else > { > if (reg_mentioned_p (XEXP (cond, 0), src) > || reg_mentioned_p (XEXP (cond, 1), src)) > { > read_comparison = true; > break; > } > } > > walk = NEXT_INSN (walk); > } > > Maybe a safer way to check whether "insn" is sensitive to the values > in "val" is to use: > > subrtx_iterator::array_type array; > FOR_EACH_SUBRTX (iter, array, val, NONCONST) > if (OBJECT_P (*iter) && reg_overlap_mentioned_p (*iter, insn)) > return true; > return false; > > which doesn't assume that insn is a single set. > > But I'm not sure which cases this code is trying to catch. Is it trying > to catch cases where seq2 "spontaneously" uses registers that happen to > overlap with cond? If so, then when does that happen? And if it does > happen, wouldn't the sequence also have to set the registers first? > > If instead the code is trying to detect uses of cond that were fed > to seq2 outside of cond itself, via try_emit_cmove_seq, then wouldn't > it be enough to check old_val and new_val? > > Anyway, the point I was trying to get to was: we could use the > FOR_EACH_SUBRTX above to check whether an instruction in seq2 > is sensitive to the value of cc_cmp/rev_cc_cmp. And we can use > modified_in_p, as in your patch, to check whether an instruction > changes the value of cc_cmp/rev_cc_cmp. We could therefore record > whether we've seen a modification of cc_cmp/rev_cc_cmp, then reject > seq2 if we see a subsequent use. > Hi Richard,
I have re-implemented this code based on all your suggestions and resubmitted it as an RFC: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649898.html The code is a bit more complicated but it handles a lot more cases. On a side note, I also observed that the original code was sometimes setting read_comparison = true more conservatively than needed due to a limitation of reg_mentioned_p. In the expression reg_mentioned_p (XEXP (cond, 1), src) both XEXP (cond, 1) and src can be (const_int 1) and reg_mentioned_p will return true, although it should return false. This is because it has a check if (reg == in) return true; Which will return true. If we look at reg_overlap_mentioned_p instead, it has a check if (CONSTANT_P (in)) return false; and I believe we need something similar for reg_mentioned_p. Thanks, Manolis > Thanks, > Richard > > > + } > > > > /* The backend might have created a sequence that uses the > > condition. Check this. */ > > @@ -3609,21 +3603,16 @@ noce_convert_multiple_sets_1 (struct noce_if_info > > *if_info, > > return false; > > } > > > > - if (cc_cmp) > > + if (cc_cmp && seq == seq1) > > { > > - /* Check if SEQ can clobber registers mentioned in > > - cc_cmp and/or rev_cc_cmp. If yes, we need to use > > - only seq1 from that point on. */ > > - rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp }; > > - for (walk = seq; walk; walk = NEXT_INSN (walk)) > > + /* Check if SEQ can clobber registers mentioned in > > cc_cmp/rev_cc_cmp. > > + If yes, we need to use only seq1 from that point on. > > + Only check when we use seq1 since we have already tested seq2. > > */ > > + if (modified_in_p (cc_cmp, seq) > > + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq))) > > { > > - note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair); > > - if (cc_cmp_pair[0] == NULL_RTX) > > - { > > - cc_cmp = NULL_RTX; > > - rev_cc_cmp = NULL_RTX; > > - break; > > - } > > + cc_cmp = NULL_RTX; > > + rev_cc_cmp = NULL_RTX; > > } > > }