> -----Original Message-----
> From: Christophe Lyon <[email protected]>
> Sent: 27 January 2021 13:12
> To: Kyrylo Tkachov <[email protected]>
> Cc: Kyrylo Tkachov via Gcc-patches <[email protected]>
> Subject: Re: arm: Adjust cost of vector of constant zero
>
> On Wed, 27 Jan 2021 at 10:15, Kyrylo Tkachov <[email protected]>
> wrote:
> >
> > Hi Christophe,
> >
> > > -----Original Message-----
> > > From: Gcc-patches <[email protected]> On Behalf Of
> > > Christophe Lyon via Gcc-patches
> > > Sent: 26 January 2021 18:03
> > > To: gcc Patches <[email protected]>
> > > Subject: arm: Adjust cost of vector of constant zero
> > >
> > > Neon vector comparisons have a dedicated version when comparing with
> > > constant zero: it means its cost is free.
> > >
> > > Adjust the cost in arm_rtx_costs_internal accordingly, for Neon only,
> > > since MVE does not support this.
> >
> > I guess the other way to do this would be in the comparison code handling
> in this function where we could check for a const_vector of zeroes and a
> Neon mode and avoid recursing into the operands.
> > That would avoid the extra switch statement in your patch.
> > WDYT?
>
> Do you mean like so:
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 4a5f265..542c15e 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -11316,6 +11316,28 @@ arm_rtx_costs_internal (rtx x, enum rtx_code
> code, enum rtx_code outer_code,
> *cost = 0;
> return true;
> }
> + /* Neon has special instructions when comparing with 0 (vceq, vcge,
> vcgt,
> + vcle and vclt). */
> + else if (TARGET_NEON
> + && TARGET_HARD_FLOAT
> + && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE
> (mode))
> + && (XEXP (x, 1) == CONST0_RTX (mode)))
> + {
> + switch (code)
> + {
> + case EQ:
> + case GE:
> + case GT:
> + case LE:
> + case LT:
> + *cost = 0;
> + return true;
> +
> + default:
> + break;
> + }
> + }
> +
> return false;
>
> I'm not sure I can remove the switch, since the other comparisons are
> not supported by Neon anyway.
>
No, I mean where:
case EQ:
case NE:
case LT:
case LE:
case GT:
case GE:
case LTU:
case LEU:
case GEU:
case GTU:
case ORDERED:
case UNORDERED:
case UNEQ:
case UNLE:
case UNLT:
case UNGE:
case UNGT:
case LTGT:
if (outer_code == SET)
{
/* Is it a store-flag operation? */
if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) == CC_REGNUM
you reorder the codes that are relevant to NEON, handle them for a vector zero
argument (and the right target checks), and fall through to the rest if not.
Kyrill
> Thanks,
>
> Christophe
>
>
> > Thanks,
> > Kyrill
> >
> > >
> > > 2021-01-26 Christophe Lyon <[email protected]>
> > >
> > > gcc/
> > > PR target/98730
> > > * config/arm/arm.c (arm_rtx_costs_internal): Adjust cost of vector
> > > of constant zero for comparisons.
> > >
> > > gcc/testsuite/
> > > PR target/98730
> > > * gcc.target/arm/simd/vceqzq_p64.c: Update expected result.
> > >
> > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > index 4a5f265..9c5c0df 100644
> > > --- a/gcc/config/arm/arm.c
> > > +++ b/gcc/config/arm/arm.c
> > > @@ -11544,7 +11544,28 @@ arm_rtx_costs_internal (rtx x, enum
> rtx_code
> > > code, enum rtx_code outer_code,
> > > && (VALID_NEON_DREG_MODE (mode) ||
> VALID_NEON_QREG_MODE
> > > (mode)))
> > > || TARGET_HAVE_MVE)
> > > && simd_immediate_valid_for_move (x, mode, NULL, NULL))
> > > - *cost = COSTS_N_INSNS (1);
> > > + {
> > > + *cost = COSTS_N_INSNS (1);
> > > +
> > > + /* Neon has special instructions when comparing with 0 (vceq, vcge,
> > > + vcgt, vcle and vclt). */
> > > + if (TARGET_NEON && (x == CONST0_RTX (mode)))
> > > + {
> > > + switch (outer_code)
> > > + {
> > > + case EQ:
> > > + case GE:
> > > + case GT:
> > > + case LE:
> > > + case LT:
> > > + *cost = COSTS_N_INSNS (0);
> > > + break;
> > > +
> > > + default:
> > > + break;
> > > + }
> > > + }
> > > + }
> > > else
> > > *cost = COSTS_N_INSNS (4);
> > > return true;
> > > diff --git a/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
> > > b/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
> > > index 640754c..a99bb8a 100644
> > > --- a/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
> > > +++ b/gcc/testsuite/gcc.target/arm/simd/vceqzq_p64.c
> > > @@ -15,4 +15,4 @@ void func()
> > > result2 = vceqzq_p64 (v2);
> > > }
> > >
> > > -/* { dg-final { scan-assembler-times "vceq\.i32\[
> > > \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 2 } } */
> > > +/* { dg-final { scan-assembler-times "vceq\.i32\[
> > > \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, #0\n" 2 } } */