> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jeff Law
> Sent: Tuesday, December 02, 2014 6:11 AM
> To: Zhenqiang Chen
> Cc: Steven Bosscher; gcc-patches@gcc.gnu.org; Jakub Jelinek
> Subject: Re: [PATCH] Fix PR 61225
> 
> On 08/04/14 02:24, Zhenqiang Chen wrote:
> 
> >>>
> >>> ChangeLog:
> >>> 2014-05-22  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
> >>>
> >>>           Part of PR rtl-optimization/61225
> >>>           * config/i386/i386-protos.h (ix86_peephole2_rtx_equal_p):
> >>> New proto.
> >>>           * config/i386/i386.c (ix86_peephole2_rtx_equal_p): New
function.
> >>>           * regcprop.c (replace_oldest_value_reg): Add REG_EQUAL note
> when
> >>>           propagating to SET.
> >>
> >> I can't help but wonder why the new 4 insn combination code isn't
> >> presenting this as a nice big fat insn to the x86 backend which would
> >> eliminate the need for the peep2.
> >>
> >> But, assuming there's a fundamental reason why that's not kicking in...
> >
> > Current combine pass can only handle
> >
> > I0 -> I1 -> I2 -> I3.
> > I0, I1 -> I2, I2 -> I3.
> > I0 -> I2; I1, I2 -> I3.
> > I0 -> I1; I1, I2 -> I3.
> >
> > For the case, it is
> > I1 -> I2 -> I3; I2 -> INSN
> >
> > I3 and INSN looks like not related. But INSN is a COMPARE to set CC
> > and I3 can also set CC. I3 and INSN can be combined together as one
> > instruction to set CC.
> Presumably there's no dataflow between I3 and INSN because they both set
> CC (doesn't that make them anti-dependent?
> 
> Can you show me the RTL corresponding to I1, I2, I3 and INSN, I simply
find it
> easier to look at RTL rather than guess why we don't have the appropriate
> linkage and thus not attempting the combinations we want.
> 
> Pseudo code for the resulting I3 and INSN would help -- as I work through
> this there's some inconsistencies in how I'm interpreting a few things and
RTL
> and pseudo-rtl for the desired output RTL would help a lot.

C code:

    if (!--*p)

rtl code:

    6: r91:SI=[r90:SI]
    7: {r88:SI=r91:SI-0x1;clobber flags:CC;}
    8: [r90:SI]=r88:SI
    9: flags:CCZ=cmp(r88:SI,0)

expected output:

    8: {flags:CCZ=cmp([r90:SI]-0x1,0);[r90:SI]=[r90:SI]-0x1;}

in assemble, it is

  decl (%eax)

> >
> > ChangeLog
> > 2014-08-04  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
> >
> >          Part of PR rtl-optimization/61225
> >          * combine.c (refer_same_reg_p): New function.
> >          (combine_instructions): Handle I1 -> I2 -> I3; I2 -> insn.
> >          (try_combine): Add one more parameter TO_COMBINED_INSN, which
> is
> >          used to create a new insn parallel (TO_COMBINED_INSN, I3).
> >
> > testsuite/ChangeLog:
> > 2014-08-04  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
> >
> >          * gcc.target/i386/pr61225.c: New test.
> >
> > diff --git a/gcc/combine.c b/gcc/combine.c index 53ac1d6..42098ab
> > 100644
> > --- a/gcc/combine.c
> > +++ b/gcc/combine.c
> > @@ -412,7 +412,7 @@ static int cant_combine_insn_p (rtx);
> >   static int can_combine_p (rtx, rtx, rtx, rtx, rtx, rtx, rtx *, rtx *);
> >   static int combinable_i3pat (rtx, rtx *, rtx, rtx, rtx, int, int, rtx
*);
> >   static int contains_muldiv (rtx);
> > -static rtx try_combine (rtx, rtx, rtx, rtx, int *, rtx);
> > +static rtx try_combine (rtx, rtx, rtx, rtx, int *, rtx, rtx);
> >   static void undo_all (void);
> >   static void undo_commit (void);
> >   static rtx *find_split_point (rtx *, rtx, bool); @@ -1099,6 +1099,46
> > @@ insn_a_feeds_b (rtx a, rtx b)
> >   #endif
> >     return false;
> >   }
> > +
> > +/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2.
> > +   It returns TRUE, if reg1 == reg2, and no other refer of reg1
> > +   except A and B.  */
> > +
> > +static bool
> > +refer_same_reg_p (rtx a, rtx b)
> > +{
> > +  rtx seta = single_set (a);
> > +  rtx setb = single_set (b);
> > +
> > +  if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b)
> > +     || !seta || !setb)
> > +    return false;
> Go ahead and use
>          || !setb
>          || !setb
> 
> It's a bit more vertical space, but I believe closer in line with our
coding
> standards.

Updated.
 
> > +
> > +  if (GET_CODE (SET_SRC (seta)) != COMPARE
> > +      || GET_MODE_CLASS (GET_MODE (SET_DEST (seta))) != MODE_CC
> > +      || !REG_P (XEXP (SET_SRC (seta), 0))
> > +      || !const0_rtx
> > +      || !REG_P (SET_SRC (setb))
> > +      || REGNO (SET_SRC (setb)) != REGNO (XEXP (SET_SRC (seta), 0)))
> > +    return false;
> What's the !const0_rtx test here?  Don't you want to test some object
> from SETA against const0_rtx?  Also note that you may need to test
> against CONST0_RTX (mode)

It's my fault. It should be

XEXP (SET_SRC (seta), 1) != const0_rtx

The updated patch is attached.

Thanks!
-Zhenqiang

 > > @@ -1431,6 +1468,50 @@ combine_instructions (rtx f, unsigned int nregs)
> >                    }
> >                }
> >
> > +         /* Try to combine a compare insn that sets CC
> > +            with a preceding insn that can set CC, and maybe with its
> > +            logical predecessor as well.
> > +            We need this special code because data flow connections
> > +            do not get entered in LOG_LINKS.  */
> So you'd want to be more specific about what dataflow connections are
> not in the LOG_LINKS that we want.
> 
> It feels to me like we're missing the anti-dependence links on CC and
> that there's a general aspect to combine missing here.  But I want to
> hold off on final judgement until I know more.
> 
> I also wonder if compare-elim ought to be helping here.  Isn't that the
> point here, to eliminate the comparison and instead get it for free as
> part of the arithmetic?  If so, is it the fact that we have memory
> references that prevents compare-elim from kicking in?
> 
> jeff
> 

Attachment: pr61225.patch
Description: Binary data

Reply via email to