> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Monday, November 13, 2023 7:09 AM
> To: Andrew Pinski <pins...@gmail.com>
> Cc: Tamar Christina <tamar.christ...@arm.com>; Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org>; gcc-patches@gcc.gnu.org; nd
> <n...@arm.com>; j...@ventanamicro.com
> Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to
> copysign (x, -1) [PR109154]
> 
> On Fri, 10 Nov 2023, Andrew Pinski wrote:
> 
> > On Fri, Nov 10, 2023 at 5:12 AM Richard Biener <rguent...@suse.de> wrote:
> > >
> > > On Fri, 10 Nov 2023, Tamar Christina wrote:
> > >
> > > >
> > > > Hi Prathamesh,
> > > >
> > > > > Yes Arm requires SIMD for copysign. The testcases fail because they
> > > > > don't turn on Neon.
> > > >
> > > > I'll update them.
> > >
> > > On x86_64 with -m32 I see
> > >
> > > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = -" 4
> > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = \\\\.COPYSIGN" 2
> > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\\\n]* = ABS_EXPR <" 1
> > > FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> > >
> > > maybe add a copysign effective target?
> >
> > I get the feeling that the internal function for copysign should not
> > be a direct internal function for scalar modes and call
> > expand_copysign instead when expanding.
> > This will fix some if not all of the issues where COPYSIGN is now
> > trying to show up.
> 
> But then I'd rather have a COPYSIGN_EXPR tree code, leaving internal-fns to
> optab mappings.  We've discussed this and discarded any of this as too much
> work right now.

I have a patch mostly written for next stage1 that will allow IFNs to have a
fallback target hook as an option.  This would then allow us to remove things
like XORSIGN as well.  At the moment it's just a prototype, and I don't have
time to finish it before stage1 ends, but it cleans things up a lot in targets
that don't use the copysign RTX code.

Regards,
Tamar
> 
> But yes, the situation is a bit messy (as also discussed).
> 
> Richard.
> 
> > By the way, this is most likely PR 88786 (and PR 112468 and a few
> > others), and PR 58797.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> > >
> > > > Regards,
> > > > Tamar
> > > > ________________________________
> > > > From: Prathamesh Kulkarni <prathamesh.kulka...@linaro.org>
> > > > Sent: Friday, November 10, 2023 12:24 PM
> > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd
> > > > <n...@arm.com>; rguent...@suse.de <rguent...@suse.de>;
> > > > j...@ventanamicro.com <j...@ventanamicro.com>
> > > > Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg
> > > > (fabs (x)) to copysign (x, -1) [PR109154]
> > > >
> > > > > On Mon, 6 Nov 2023 at 15:50, Tamar Christina <tamar.christ...@arm.com> wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > > This patch transforms fneg (fabs (x)) into copysign (x, -1)
> > > > > > which is more canonical and allows a target to expand this
> > > > > > sequence efficiently.  Such sequences are common in scientific
> > > > > > code working with gradients.
> > > > >
> > > > > There is an existing canonicalization of copysign (x, -1) to
> > > > > fneg (fabs (x)) which I remove since this is a less efficient
> > > > > form.  The testsuite is also updated in light of this.
> > > > >
> > > > > > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> > > > Hi Tamar,
> > > > > It seems the patch caused the following regressions on arm:
> > > >
> > > > Running gcc:gcc.dg/dg.exp ...
> > > > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > > > > FAIL: gcc.dg/pr55152-2.c scan-tree-dump-times optimized "ABS_EXPR" 1
> > > >
> > > > Running gcc:gcc.dg/tree-ssa/tree-ssa.exp ...
> > > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= -" 1
> > > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= .COPYSIGN" 2
> > > > > FAIL: gcc.dg/tree-ssa/abs-4.c scan-tree-dump-times optimized "= ABS_EXPR" 1
> > > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = -" 4
> > > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = ABS_EXPR <" 1
> > > > > FAIL: gcc.dg/tree-ssa/backprop-6.c scan-tree-dump-times backprop "Deleting[^\\n]* = \\.COPYSIGN" 2
> > > > > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized ".COPYSIGN" 1
> > > > > FAIL: gcc.dg/tree-ssa/copy-sign-2.c scan-tree-dump-times optimized "ABS" 1
> > > > > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple ".COPYSIGN" 4
> > > > > FAIL: gcc.dg/tree-ssa/mult-abs-2.c scan-tree-dump-times gimple "ABS" 4
> > > > FAIL: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
> > > > Link to log files:
> > > > > https://ci.linaro.org/job/tcwg_gcc_check--master-arm-build/1240/artifact/artifacts/00-sumfiles/
> > > >
> > > > > Even for the following test case:
> > > > double g (double a)
> > > > {
> > > >   double t1 = fabs (a);
> > > >   double t2 = -t1;
> > > >   return t2;
> > > > }
> > > >
> > > > > It seems the pattern gets applied, but the result is not eventually
> > > > > simplified to copysign(a, -1).
> > > > forwprop dump shows:
> > > > > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > > > > double g (double a)
> > > > > {
> > > >   double t2;
> > > >   double t1;
> > > >
> > > >   <bb 2> :
> > > >   t1_2 = ABS_EXPR <a_1(D)>;
> > > >   t2_3 = -t1_2;
> > > >   return t2_3;
> > > >
> > > > }
> > > >
> > > > while on x86_64:
> > > > > Applying pattern match.pd:1131, gimple-match-4.cc:4134
> > > > > gimple_simplified to t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> > > > > Removing dead stmt:t1_2 = ABS_EXPR <a_1(D)>;
> > > > > double g (double a)
> > > > > {
> > > >   double t2;
> > > >   double t1;
> > > >
> > > >   <bb 2> :
> > > >   t2_3 = .COPYSIGN (a_1(D), -1.0e+0);
> > > >   return t2_3;
> > > >
> > > > }
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > >
> > > >
> > > > >
> > > > > Ok for master?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR tree-optimization/109154
> > > > > >         * match.pd: Add new neg+abs rule, remove inverse copysign
> > > > > >         rule.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         PR tree-optimization/109154
> > > > >         * gcc.dg/fold-copysign-1.c: Updated.
> > > > >         * gcc.dg/pr55152-2.c: Updated.
> > > > >         * gcc.dg/tree-ssa/abs-4.c: Updated.
> > > > >         * gcc.dg/tree-ssa/backprop-6.c: Updated.
> > > > >         * gcc.dg/tree-ssa/copy-sign-2.c: Updated.
> > > > >         * gcc.dg/tree-ssa/mult-abs-2.c: Updated.
> > > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.
> > > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.
> > > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.
> > > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.
> > > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.
> > > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.
> > > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.
> > > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
> > > > >
> > > > > --- inline copy of patch --
> > > > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > > > index db95931df0672cf4ef08cca36085c3aa6831519e..7a023d510c283c43a87b1795a74761b8af979b53 100644
> > > > > > --- a/gcc/match.pd
> > > > > > +++ b/gcc/match.pd
> > > > > > @@ -1106,13 +1106,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > >     (hypots @0 (copysigns @1 @2))
> > > > >     (hypots @0 @1))))
> > > > >
> > > > > > -/* copysign(x, CST) -> [-]abs (x).  */
> > > > > > -(for copysigns (COPYSIGN_ALL)
> > > > > - (simplify
> > > > > -  (copysigns @0 REAL_CST@1)
> > > > > -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
> > > > > -   (negate (abs @0))
> > > > > -   (abs @0))))
> > > > > +/* Transform fneg (fabs (X)) -> copysign (X, -1).  */
> > > > > +
> > > > > +(simplify
> > > > > + (negate (abs @0))
> > > > > + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))
> > > > >
> > > > > >  /* copysign(copysign(x, y), z) -> copysign(x, z).  */
> > > > > >  (for copysigns (COPYSIGN_ALL)
> > > > > > diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > > > index f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> > > > > > @@ -12,5 +12,5 @@ double bar (double x)
> > > > > >    return __builtin_copysign (x, minuszero);
> > > > > >  }
> > > > > >
> > > > > > -/* { dg-final { scan-tree-dump-times "= -" 1 "cddce1" } } */
> > > > > > -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 2 "cddce1" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times "__builtin_copysign" 1 "cddce1" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "cddce1" } } */
> > > > > > diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > > > index 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/pr55152-2.c
> > > > > > @@ -10,4 +10,5 @@ int f(int a)
> > > > > >    return (a<-a)?a:-a;
> > > > > >  }
> > > > > >
> > > > > > -/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 1 "optimized" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times "ABS_EXPR" 1 "optimized" } } */
> > > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > > > index 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> > > > > > @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }
> > > > > >
> > > > > >  /* __builtin_signbit(x) ? x : -x. Should be convert into - ABS_EXP<x> */
> > > > > >  /* { dg-final { scan-tree-dump-not "signbit" "optimized"} } */
> > > > > > -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 3 "optimized"} } */
> > > > > > -/* { dg-final { scan-tree-dump-times "= -" 3 "optimized"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "optimized"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "= -" 1 "optimized"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "= \.COPYSIGN" 2 "optimized"} } */
> > > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > > > index 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> > > > > > @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)
> > > > > >  TEST_FUNCTION (double, )
> > > > > >  TEST_FUNCTION (long double, l)
> > > > > >
> > > > > > -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop" } } */
> > > > > > -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 "backprop" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 4 "backprop" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = \.COPYSIGN} 2 "backprop" } } */
> > > > > > +/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 1 "backprop" } } */
> > > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > > > index de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c
> > > > > > @@ -10,4 +10,5 @@ float f1(float x)
> > > > > >    float t = __builtin_copysignf (1.0f, -x);
> > > > > >    return x * t;
> > > > > >  }
> > > > > > -/* { dg-final { scan-tree-dump-times "ABS" 2 "optimized"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "ABS" 1 "optimized"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times ".COPYSIGN" 1 "optimized"} } */
> > > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > > > index a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644
> > > > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c
> > > > > > @@ -34,4 +34,5 @@ float i1(float x)
> > > > > >  {
> > > > > >    return x * (x <= 0.f ? 1.f : -1.f);
> > > > > >  }
> > > > > > -/* { dg-final { scan-tree-dump-times "ABS" 8 "gimple"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "ABS" 4 "gimple"} } */
> > > > > > +/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 4 "gimple"} } */
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c
> > > > > > @@ -0,0 +1,39 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#pragma GCC target "+nosve"
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +
> > > > > +/*
> > > > > +** t1:
> > > > > +**     orr     v[0-9]+.2s, #128, lsl #24
> > > > > +**     ret
> > > > > +*/
> > > > > +float32x2_t t1 (float32x2_t a)
> > > > > +{
> > > > > > +  return vneg_f32 (vabs_f32 (a));
> > > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** t2:
> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24
> > > > > +**     ret
> > > > > +*/
> > > > > +float32x4_t t2 (float32x4_t a)
> > > > > +{
> > > > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** t3:
> > > > > +**     adrp    x0, .LC[0-9]+
> > > > > +**     ldr     q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > > +**     ret
> > > > > +*/
> > > > > +float64x2_t t3 (float64x2_t a)
> > > > > +{
> > > > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> > > > > > @@ -0,0 +1,31 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#pragma GCC target "+nosve"
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +#include <math.h>
> > > > > +
> > > > > +/*
> > > > > +** f1:
> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float32_t f1 (float32_t a)
> > > > > +{
> > > > > +  return -fabsf (a);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** f2:
> > > > > +**     mov     x0, -9223372036854775808
> > > > > +**     fmov    d[0-9]+, x0
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float64_t f2 (float64_t a)
> > > > > +{
> > > > > +  return -fabs (a);
> > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c
> > > > > > @@ -0,0 +1,36 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#pragma GCC target "+nosve"
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +#include <math.h>
> > > > > +
> > > > > +/*
> > > > > +** f1:
> > > > > +**     ...
> > > > > +**     ldr     q[0-9]+, \[x0\]
> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24
> > > > > +**     str     q[0-9]+, \[x0\], 16
> > > > > +**     ...
> > > > > +*/
> > > > > +void f1 (float32_t *a, int n)
> > > > > +{
> > > > > +  for (int i = 0; i < (n & -8); i++)
> > > > > +   a[i] = -fabsf (a[i]);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** f2:
> > > > > +**     ...
> > > > > +**     ldr     q[0-9]+, \[x0\]
> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > > +**     str     q[0-9]+, \[x0\], 16
> > > > > +**     ...
> > > > > +*/
> > > > > +void f2 (float64_t *a, int n)
> > > > > +{
> > > > > +  for (int i = 0; i < (n & -8); i++)
> > > > > +   a[i] = -fabs (a[i]);
> > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> > > > > > @@ -0,0 +1,39 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#pragma GCC target "+nosve"
> > > > > +
> > > > > +#include <string.h>
> > > > > +
> > > > > +/*
> > > > > +** negabs:
> > > > > +**     mov     x0, -9223372036854775808
> > > > > +**     fmov    d[0-9]+, x0
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +double negabs (double x)
> > > > > +{
> > > > > +   unsigned long long y;
> > > > > +   memcpy (&y, &x, sizeof(double));
> > > > > +   y = y | (1UL << 63);
> > > > > +   memcpy (&x, &y, sizeof(double));
> > > > > +   return x;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** negabsf:
> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float negabsf (float x)
> > > > > +{
> > > > > +   unsigned int y;
> > > > > +   memcpy (&y, &x, sizeof(float));
> > > > > +   y = y | (1U << 31);
> > > > > +   memcpy (&x, &y, sizeof(float));
> > > > > +   return x;
> > > > > +}
> > > > > +
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c
> > > > > > @@ -0,0 +1,37 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +
> > > > > +/*
> > > > > +** t1:
> > > > > +**     orr     v[0-9]+.2s, #128, lsl #24
> > > > > +**     ret
> > > > > +*/
> > > > > +float32x2_t t1 (float32x2_t a)
> > > > > +{
> > > > > > +  return vneg_f32 (vabs_f32 (a));
> > > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** t2:
> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24
> > > > > +**     ret
> > > > > +*/
> > > > > +float32x4_t t2 (float32x4_t a)
> > > > > +{
> > > > > > +  return vnegq_f32 (vabsq_f32 (a));
> > > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** t3:
> > > > > +**     adrp    x0, .LC[0-9]+
> > > > > +**     ldr     q[0-9]+, \[x0, #:lo12:.LC0\]
> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> > > > > +**     ret
> > > > > +*/
> > > > > +float64x2_t t3 (float64x2_t a)
> > > > > +{
> > > > > > +  return vnegq_f64 (vabsq_f64 (a));
> > > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c
> > > > > > @@ -0,0 +1,29 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +#include <math.h>
> > > > > +
> > > > > +/*
> > > > > +** f1:
> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float32_t f1 (float32_t a)
> > > > > +{
> > > > > +  return -fabsf (a);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** f2:
> > > > > +**     mov     x0, -9223372036854775808
> > > > > +**     fmov    d[0-9]+, x0
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float64_t f2 (float64_t a)
> > > > > +{
> > > > > +  return -fabs (a);
> > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> > > > > > @@ -0,0 +1,34 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#include <arm_neon.h>
> > > > > +#include <math.h>
> > > > > +
> > > > > +/*
> > > > > +** f1:
> > > > > +**     ...
> > > > > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \[x0, x2, lsl 2\]
> > > > > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000
> > > > > +**     st1w    z[0-9]+.s, p[0-9]+, \[x0, x2, lsl 2\]
> > > > > +**     ...
> > > > > +*/
> > > > > +void f1 (float32_t *a, int n)
> > > > > +{
> > > > > +  for (int i = 0; i < (n & -8); i++)
> > > > > +   a[i] = -fabsf (a[i]);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** f2:
> > > > > +**     ...
> > > > > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \[x0, x2, lsl 3\]
> > > > > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000
> > > > > +**     st1d    z[0-9]+.d, p[0-9]+, \[x0, x2, lsl 3\]
> > > > > +**     ...
> > > > > +*/
> > > > > +void f2 (float64_t *a, int n)
> > > > > +{
> > > > > +  for (int i = 0; i < (n & -8); i++)
> > > > > +   a[i] = -fabs (a[i]);
> > > > > +}
> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > > > new file mode 100644
> > > > > > index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> > > > > > @@ -0,0 +1,37 @@
> > > > > > +/* { dg-do compile } */
> > > > > > +/* { dg-options "-O3" } */
> > > > > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > > > > +
> > > > > +#include <string.h>
> > > > > +
> > > > > +/*
> > > > > +** negabs:
> > > > > +**     mov     x0, -9223372036854775808
> > > > > +**     fmov    d[0-9]+, x0
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +double negabs (double x)
> > > > > +{
> > > > > +   unsigned long long y;
> > > > > +   memcpy (&y, &x, sizeof(double));
> > > > > +   y = y | (1UL << 63);
> > > > > +   memcpy (&x, &y, sizeof(double));
> > > > > +   return x;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > +** negabsf:
> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24
> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> > > > > +**     ret
> > > > > +*/
> > > > > +float negabsf (float x)
> > > > > +{
> > > > > +   unsigned int y;
> > > > > +   memcpy (&y, &x, sizeof(float));
> > > > > +   y = y | (1U << 31);
> > > > > +   memcpy (&x, &y, sizeof(float));
> > > > > +   return x;
> > > > > +}
> > > > > +
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > Nuernberg)
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
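
For readers following the thread, here is a minimal C sketch (not part of the
patch) of the identity the new match.pd rule relies on: on IEEE 754 targets,
-fabs (x), copysign (x, -1.0), and setting the sign bit by hand should all
produce the same bit pattern.  The helper names (bits_of, neg_abs,
neg_abs_copysign, neg_abs_bits, count_mismatches) are made up for illustration,
and the check assumes that double is a 64-bit IEEE 754 type.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumed to be 64 bits wide).  */
uint64_t bits_of (double x)
{
  uint64_t u;
  assert (sizeof u == sizeof x);
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Form 1: negate the absolute value, as it is usually written in source.  */
double neg_abs (double x)
{
  return -fabs (x);
}

/* Form 2: the copysign form that the patch canonicalizes to.  */
double neg_abs_copysign (double x)
{
  return copysign (x, -1.0);
}

/* Form 3: set the sign bit directly, as the negabs test in the patch does.  */
double neg_abs_bits (double x)
{
  uint64_t u = bits_of (x) | (UINT64_C (1) << 63);
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Count the sample inputs for which the three forms differ bit-for-bit.  */
int count_mismatches (void)
{
  const double inputs[] = { 0.0, -0.0, 1.5, -1.5, 1e300, -1e-300,
                            INFINITY, -INFINITY, NAN };
  int bad = 0;

  for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    {
      uint64_t a = bits_of (neg_abs (inputs[i]));
      uint64_t b = bits_of (neg_abs_copysign (inputs[i]));
      uint64_t c = bits_of (neg_abs_bits (inputs[i]));
      if (a != b || b != c)
        bad++;
    }
  return bad;
}
```

On such targets count_mismatches () is expected to return 0.  Whether the
compiler then emits a single OR of the sign bit depends on the target's
copysign support, which is what the failures reported in this thread are about.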
