On Tue, Dec 09, 2014 at 05:49:18PM +0800, Zhenqiang Chen wrote:
> > Do you need to verify SETA and SETB satisfy single_set?  Or has that
> > already been done elsewhere?
> 
> A is NEXT_INSN (insn)
> B is prev_nonnote_nondebug_insn (insn),
> 
> For I1 -> I2 -> B; I2 -> A;
> LOG_LINK can make sure I1 and I2 are single_set,

It cannot, not anymore anyway.  LOG_LINKs can point to an insn with multiple
SETs; multiple LOG_LINKs can point to such an insn.

The only thing a LOG_LINK from Y to X tells you is that Y is the earliest
insn after X that uses some register set by X (and it knows which register,
too, nowadays).

> > > +  if (DF_REG_USE_COUNT (REGNO (SET_SRC (setb))) > 2)
> > > +    {
> > > +      df_ref use;
> > > +      rtx insn;
> > > +      unsigned int i = REGNO (SET_SRC (setb));
> > > +
> > > +      for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG
> (use))
> > > +        {
> > > +   insn = DF_REF_INSN (use);
> > > +   if (insn != a && insn != b && !(NOTE_P (insn) || DEBUG_INSN_P
> > (insn)))
> > > +     return false;
> > > + }
> > > +    }
> > > +
> > > +  return true;
> > > +}
> > Is this fragment really needed?  Does it ever trigger?  I'd think that
> > for > 2 uses punting would be fine.  Do we really commonly have cases
> > with > 2 uses, but where they're all in SETA and SETB?

Can't you just check for a death note on the second insn?  Together with
reg_used_between_p?

> > > +   /* Try to combine a compare insn that sets CC
> > > +      with a preceding insn that can set CC, and maybe with its
> > > +      logical predecessor as well.
> > > +      We need this special code because data flow connections
> > > +      do not get entered in LOG_LINKS.  */

I think you mean "not _all_ data flow connections"?

> > > +   if ((prev = prev_nonnote_nondebug_insn (insn)) != NULL_RTX
> > > +       && refer_same_reg_p (insn, prev)
> > > +       && max_combine >= 4)
> > > +     {
> > > +         struct insn_link *next1;
> > > +         FOR_EACH_LOG_LINK (next1, prev)
> > > +           {
> > > +             rtx_insn *link1 = next1->insn;
> > > +             if (NOTE_P (link1))
> > > +               continue;
> > > +             /* I1 -> I2 -> I3; I2 -> insn;
> > > +                output parallel (insn, I3).  */
> > > +             FOR_EACH_LOG_LINK (nextlinks, link1)
> > > +               if ((next = try_combine (prev, link1,
> > > +                                        nextlinks->insn, NULL,
> > > +                                        &new_direct_jump_p,
> > > +                                        last_combined_insn, insn)) !=
> 0)
> > > +
> > > +                 {
> > > +                   delete_insn (insn);
> > > +                   insn = next;
> > > +                   statistics_counter_event (cfun, "four-insn
> combine",
> > 1);
> > > +                   goto retry;
> > > +                 }
> > > +             /* I2 -> I3; I2 -> insn
> > > +                output next = parallel (insn, I3).  */
> > > +             if ((next = try_combine (prev, link1,
> > > +                                      NULL, NULL,
> > > +                                      &new_direct_jump_p,
> > > +                                      last_combined_insn, insn)) !=
> 0)
> > > +
> > > +               {
> > > +                 delete_insn (insn);
> > > +                 insn = next;
> > > +                 statistics_counter_event (cfun, "three-insn
> combine",
> > 1);
> > > +                 goto retry;
> > > +               }
> > > +           }
> > > +     }
> > So you've got two new combine cases here, but I think the testcase only
> > tests one of them.  Can you include a testcase for both of hte major
> > paths above (I1->I2->I3; I2->insn and I2->I3; I2->INSN)
> 
> pr61225.c is the case to cover I1->I2->I3; I2->insn.
> 
> For I2 -> I3; I2 -> insn, I tried my test cases and found peephole2 can also
> handle them. So I removed the code from the patch.

Why?  The simpler case has much better chances of being used.

In fact, there are many more cases you could handle:

You handle

I1 -> I2 -> I3; I2 -> insn
      I2 -> I3; I2 -> insn

but there are also

   I1,I2 -> I3; I2 -> insn

and the many 4-insn combos, too.
But that's not all: instead of just dealing with I2->insn, you can just as
well have I1->insn or I0->insn, and if you could handle the SET not dying
in the resulting insn, I3->insn.

In fact, in that case you really only need to handle I3->insn (no other
instructions involved), as a simple 2-insn combination that combines into
the earlier insn instead of the later, to get the effect you want.

Just like your patch, that would pull "insn" earlier, but it would do it
much more explicitly.


Some comments on the patch...

> +/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2.
> +   It returns TRUE, if reg1 == reg2, and no other refer of reg1
> +   except A and B.  */

That sound like the only correct inputs are such a compare etc., but the
routine tests whether that is true.

> +static bool
> +can_reuse_cc_set_p (rtx_insn *a, rtx_insn *b)
> +{
> +  rtx seta = single_set (a);
> +  rtx setb = single_set (b);
> +
> +  if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b)

Neither the comment nor the function name mention this.  This test is
better placed in the caller of this function, anyway.

> +       /* Try to combine a compare insn that sets CC
> +          with a preceding insn that can set CC, and maybe with its
> +          logical predecessor as well.

If you do remove the I2->I3 case this comment needs updating.  If you
don't remove it, it should be tried before the I1->I2->I3 case.

> +   TO_COMBINED_INSN is an insn after I3 that sets CC flags.  It is used to
> +   combine with I3 to make a new insn.  */

"combine into I3", to make clear the resulting insn is placed at I3?

> @@ -3323,7 +3396,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
> *i1, rtx_insn *i0,
>         rtx old = newpat;
>         total_sets = 1 + extra_sets;
>         newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets));
> -       XVECEXP (newpat, 0, 0) = old;
> +
> +       if (to_combined_insn)
> +         XVECEXP (newpat, 0, --total_sets) = old;
> +       else
> +         XVECEXP (newpat, 0, 0) = old;
>       }

Is this correct?  If so, it needs a big fat comment, because it is
not exactly obvious :-)

Also, it doesn't handle at all the case where the new pattern already is
a PARALLEL; can that never happen?

Cheers,


Segher

Reply via email to