RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Tamar Christina
> > It should be a match.pd rule that uses a match predicate, so expand in
> > gimple-match.c, but don't do this if the target doesn't have the
> > xorsign optab and don't do it if honouring signaling NaNs.
>
> Note that this will trigger too early (IMHO), so unless you feel like
> inventing new infrastructure I'd put manual pattern matching in
> tree-ssa-math-opts.c pass_optimize_widening_mul where we currently do
> this kind of "late GIMPLE instruction selection".
> 

Alright, I'll do that then, thanks!

> Richard.
> 
> > I'll make the changes then.
> > Thanks,
> > Tamar
> >
> > >
> > > Think of a combine pass combining GIMPLE stmts to (recognized) RTL
> > > insn (sequences).  Until RTL expansion the RTL insn (sequence) would
> > > be represented by an internal function call (or alternatively for
> > > > multi-output cases a GIMPLE ASM with enumerated asm text).
> > >
> > > Richard.
> > >
> > > > > Thanks,
> > > > > Richard.
> > > > >
> > > > > >
> > > > > > gcc/
> > > > > > 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
> > > > > > Andrew Pinski <pins...@gmail.com>
> > > > > >
> > > > > > PR middle-end/19706
> > > > > > * expr.c (is_copysign_call_with_1): New.
> > > > > > (maybe_expand_mult_copysign): Likewise.
> > > > > > (expand_expr_real_2): Expand copysign.
> > > > > >     * optabs.def (xorsign_optab): New.
> > > > > >
> > > > > > 
> > > > > > From: Andrew Pinski <pins...@gmail.com>
> > > > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > > > To: Tamar Christina
> > > > > > Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com;
> > > > > > rguent...@suse.de
> > > > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0,
> > > > > > y) [Patch (1/2)]
> > > > > >
> > > > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > > > <tamar.christ...@arm.com> wrote:
> > > > > > > Hi All,
> > > > > > >
> > > > > > > > this patch implements an optimization rewriting
> > > > > > > >
> > > > > > > > x * copysign (1.0, y) and
> > > > > > > > x * copysign (-1.0, y)
> > > > > > > >
> > > > > > > > to:
> > > > > > > >
> > > > > > > > x ^ (y & (1 << sign_bit_position))
> > > > > > > >
> > > > > > > > This is done by creating a special builtin during matching
> > > > > > > > and generating the appropriate instructions during expand.
> > > > > > > > This new builtin is called XORSIGN.
> > > > > > > >
> > > > > > > > The expansion of xorsign depends on whether the backend has an
> > > > > > > > appropriate optab available. If this is not the case then we
> > > > > > > > use a modified version of the existing copysign, which does
> > > > > > > > not take the abs value of the first argument, as a fallback.
> > > > > > >
> > > > > > > This patch is a revival of a previous patch
> > > > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > > > >
> > > > > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with
> > > > > > > > no issues.
> > > > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > > > >
> > > > > >
> > > > > > Note this is also PR 19706.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew
> > > > > >
> > > > > > >
> > > > > > > Ok for trunk?
> > > > > > >
> > > > > > > gcc/
> > > > > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > > > > >
> > > > > > > > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > > > > > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLO

RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Richard Biener
On Tue, 18 Jul 2017, Tamar Christina wrote:

> 
> > I see.  But the implementation challenge is that this interacts badly with
> > SSA coalescing done before this and thus should really happen on GIMPLE
> > before that.
> >
> > And yes, I also like to see more of this, it's basically doing some
> > instruction selection on (late) GIMPLE.  Ideally we'd be able to generate
> > an expand.pd match.pd variant from the machine description (named)
> > define_insns, creating IFNs that we know how to expand.
> 
> Fair enough. Just to check I understood correctly:
>
> It should be a match.pd rule that uses a match predicate, so expand in
> gimple-match.c, but don't do this if the target doesn't have the xorsign
> optab and don't do it if honouring signaling NaNs.

Note that this will trigger too early (IMHO), so unless you feel
like inventing new infrastructure I'd put manual pattern matching in
tree-ssa-math-opts.c pass_optimize_widening_mul where we currently
do this kind of "late GIMPLE instruction selection".

Richard.
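
To make the "manual pattern matching" idea concrete, here is a toy sketch of
such a late rewrite.  It is not GCC code: the toy_expr IR and every toy_* name
are hypothetical, and the two boolean parameters merely stand in for the
xorsign optab check and the signaling-NaN check discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression IR, purely illustrative; none of these names exist
   in GCC.  */
enum toy_code { TOY_VAR, TOY_CONST, TOY_MULT, TOY_COPYSIGN, TOY_XORSIGN };

struct toy_expr
{
  enum toy_code code;
  double value;                /* Only meaningful for TOY_CONST.  */
  struct toy_expr *op0, *op1;  /* Only meaningful for binary codes.  */
};

/* True if E is the constant 1.0 or -1.0.  Both are accepted because
   copysign ignores the sign of its first argument.  */
static bool
toy_is_plus_minus_one (const struct toy_expr *e)
{
  return e->code == TOY_CONST && (e->value == 1.0 || e->value == -1.0);
}

/* Rewrite  x * copysign (+-1.0, y)  into  xorsign (x, y)  in place.
   TARGET_HAS_XORSIGN and HONOR_SNANS are stand-ins (assumptions) for
   the optab and signaling-NaN checks.  Returns true if E was changed.  */
static bool
toy_match_xorsign (struct toy_expr *e, bool target_has_xorsign,
                   bool honor_snans)
{
  assert (e != NULL);

  if (!target_has_xorsign || honor_snans || e->code != TOY_MULT)
    return false;

  /* Multiplication is commutative, so try both operand orders.  */
  for (int i = 0; i < 2; i++)
    {
      struct toy_expr *x = i == 0 ? e->op0 : e->op1;
      struct toy_expr *cs = i == 0 ? e->op1 : e->op0;

      if (cs->code == TOY_COPYSIGN && toy_is_plus_minus_one (cs->op0))
        {
          e->code = TOY_XORSIGN;
          e->op0 = x;
          e->op1 = cs->op1;
          return true;
        }
    }

  return false;
}
```

The point of the sketch is only the ordering of the checks: the rewrite is
refused up front when the target cannot expand the result or when signaling
NaNs must be honoured, so the original multiply is left untouched in those
cases.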

> I'll make the changes then.
> Thanks,
> Tamar
> 
> > 
> > Think of a combine pass combining GIMPLE stmts to (recognized) RTL insn
> > (sequences).  Until RTL expansion the RTL insn (sequence) would be
> > represented by an internal function call (or alternatively for multi-output
> > > cases a GIMPLE ASM with enumerated asm text).
> > 
> > Richard.
> > 
> > > > Thanks,
> > > > Richard.
> > > >
> > > > >
> > > > > gcc/
> > > > > 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
> > > > >   Andrew Pinski <pins...@gmail.com>
> > > > >
> > > > >   PR middle-end/19706
> > > > >   * expr.c (is_copysign_call_with_1): New.
> > > > >   (maybe_expand_mult_copysign): Likewise.
> > > > >   (expand_expr_real_2): Expand copysign.
> > > > >       * optabs.def (xorsign_optab): New.
> > > > >
> > > > > 
> > > > > From: Andrew Pinski <pins...@gmail.com>
> > > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > > To: Tamar Christina
> > > > > Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com;
> > > > > rguent...@suse.de
> > > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > > > [Patch (1/2)]
> > > > >
> > > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > > <tamar.christ...@arm.com> wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > > this patch implements an optimization rewriting
> > > > > > >
> > > > > > > x * copysign (1.0, y) and
> > > > > > > x * copysign (-1.0, y)
> > > > > > >
> > > > > > > to:
> > > > > > >
> > > > > > > x ^ (y & (1 << sign_bit_position))
> > > > > > >
> > > > > > > This is done by creating a special builtin during matching and
> > > > > > > generating the appropriate instructions during expand. This new
> > > > > > > builtin is called XORSIGN.
> > > > > > >
> > > > > > > The expansion of xorsign depends on whether the backend has an
> > > > > > > appropriate optab available. If this is not the case then we use
> > > > > > > a modified version of the existing copysign, which does not take
> > > > > > > the abs value of the first argument, as a fallback.
> > > > > >
> > > > > > This patch is a revival of a previous patch
> > > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > > >
> > > > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > > > > > issues.
> > > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > > >
> > > > >
> > > > > Note this is also PR 19706.
> > > > >
> > > > > Thanks,
> > > > > Andrew
> > > > >
> > > > > >
> > > > > > Ok for trunk?
> > > > > >
> > > > > > gcc/
> > > > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > > > >
> > > > > > * buil

RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Tamar Christina

> I see.  But the implementation challenge is that this interacts badly with SSA
> coalescing done before this and thus should really happen on GIMPLE before
> that.
> 
> And yes, I also like to see more of this, it's basically doing some
> instruction selection on (late) GIMPLE.  Ideally we'd be able to generate
> an expand.pd match.pd variant from the machine description (named)
> define_insns, creating IFNs that we know how to expand.

Fair enough. Just to check I understood correctly:

It should be a match.pd rule that uses a match predicate, so expand in
gimple-match.c, but don't do this if the target doesn't have the xorsign
optab and don't do it if honouring signaling NaNs.

I'll make the changes then.
Thanks,
Tamar

> 
> Think of a combine pass combining GIMPLE stmts to (recognized) RTL insn
> (sequences).  Until RTL expansion the RTL insn (sequence) would be
> represented by an internal function call (or alternatively for multi-output
> cases a GIMPLE ASM with enumerated asm text).
> 
> Richard.
> 
> > > Thanks,
> > > Richard.
> > >
> > > >
> > > > gcc/
> > > > 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
> > > > Andrew Pinski <pins...@gmail.com>
> > > >
> > > > PR middle-end/19706
> > > > * expr.c (is_copysign_call_with_1): New.
> > > > (maybe_expand_mult_copysign): Likewise.
> > > > (expand_expr_real_2): Expand copysign.
> > > > * optabs.def (xorsign_optab): New.
> > > >
> > > > 
> > > > From: Andrew Pinski <pins...@gmail.com>
> > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > To: Tamar Christina
> > > > Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com;
> > > > rguent...@suse.de
> > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > > [Patch (1/2)]
> > > >
> > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > <tamar.christ...@arm.com> wrote:
> > > > > Hi All,
> > > > >
> > > > > > this patch implements an optimization rewriting
> > > > > >
> > > > > > x * copysign (1.0, y) and
> > > > > > x * copysign (-1.0, y)
> > > > > >
> > > > > > to:
> > > > > >
> > > > > > x ^ (y & (1 << sign_bit_position))
> > > > > >
> > > > > > This is done by creating a special builtin during matching and
> > > > > > generating the appropriate instructions during expand. This new
> > > > > > builtin is called XORSIGN.
> > > > > >
> > > > > > The expansion of xorsign depends on whether the backend has an
> > > > > > appropriate optab available. If this is not the case then we use
> > > > > > a modified version of the existing copysign, which does not take
> > > > > > the abs value of the first argument, as a fallback.
> > > > >
> > > > > This patch is a revival of a previous patch
> > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > >
> > > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > > > > issues.
> > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > >
> > > >
> > > > Note this is also PR 19706.
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > > >
> > > > > Ok for trunk?
> > > > >
> > > > > gcc/
> > > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > > >
> > > > > > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > > > > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > > > > > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > > > > > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > > > > (copysigns @0 (negate @1)): Likewise.
> > > > > > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > > > > > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > > > > > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > > > (BUILT_IN_COPYSIGN): Updated function call.
> > > > > * optabs.h (expand_copysign): New bool.
> > > > > (expand_xorsign): New.
> > > > > * optabs.def (xorsign_optab): New.
> > > > > * optabs.c (expand_copysign): New parameter.
> > > > > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > > > * fortran/mathbuiltins.def (XORSIGN): New.
> > > > >
> > > > > gcc/testsuite/
> > > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > > >
> > > > > * gcc.dg/tree-ssa/xorsign.c: New.
> > > > > * gcc.dg/xorsign_exec.c: New.
> > > > > * gcc.dg/vec-xorsign_exec.c: New.
> > > > > > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)


RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Richard Biener
On Tue, 18 Jul 2017, Tamar Christina wrote:

> > 
> > Why's this now done during RTL expansion rather than during late GIMPLE,
> > using match.pd and an internal function for xorsign?
> > 
> 
> Mainly because of Andrew's email on the 10th which stated:
> 
> > But you should get the general idea.  I would like to see more of 
> > these special expand patterns really.
> 
> And there were no objections so I figured this was also an acceptable 
> solution.

I see.  But the implementation challenge is that this interacts badly
with SSA coalescing done before this and thus should really happen
on GIMPLE before that.

And yes, I also like to see more of this, it's basically doing some
instruction selection on (late) GIMPLE.  Ideally we'd be able to
generate an expand.pd match.pd variant from the machine
description (named) define_insns, creating IFNs that we know how
to expand.

Think of a combine pass combining GIMPLE stmts to (recognized)
RTL insn (sequences).  Until RTL expansion the RTL insn (sequence)
would be represented by an internal function call (or alternatively
for multi-output cases a GIMPLE ASM with enumerated asm text).

Richard.

> > Thanks,
> > Richard.
> > 
> > >
> > > gcc/
> > > 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
> > >   Andrew Pinski <pins...@gmail.com>
> > >
> > >   PR middle-end/19706
> > >   * expr.c (is_copysign_call_with_1): New.
> > >   (maybe_expand_mult_copysign): Likewise.
> > >   (expand_expr_real_2): Expand copysign.
> > >   * optabs.def (xorsign_optab): New.
> > >
> > > 
> > > From: Andrew Pinski <pins...@gmail.com>
> > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > To: Tamar Christina
> > > Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
> > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > [Patch (1/2)]
> > >
> > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > <tamar.christ...@arm.com> wrote:
> > > > Hi All,
> > > >
> > > > > this patch implements an optimization rewriting
> > > > >
> > > > > x * copysign (1.0, y) and
> > > > > x * copysign (-1.0, y)
> > > > >
> > > > > to:
> > > > >
> > > > > x ^ (y & (1 << sign_bit_position))
> > > > >
> > > > > This is done by creating a special builtin during matching and
> > > > > generating the appropriate instructions during expand. This new
> > > > > builtin is called XORSIGN.
> > > > >
> > > > > The expansion of xorsign depends on whether the backend has an
> > > > > appropriate optab available. If this is not the case then we use a
> > > > > modified version of the existing copysign, which does not take the
> > > > > abs value of the first argument, as a fallback.
> > > >
> > > > This patch is a revival of a previous patch
> > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > >
> > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > > > issues.
> > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > >
> > >
> > > Note this is also PR 19706.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > >
> > > > Ok for trunk?
> > > >
> > > > gcc/
> > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > >
> > > > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > > > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > > > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > > (copysigns @0 (negate @1)): Likewise.
> > > > > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > > > > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > > > > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > > (BUILT_IN_COPYSIGN): Updated function call.
> > > > * optabs.h (expand_copysign): New bool.
> > > > (expand_xorsign): New.
> > > > * optabs.def (xorsign_optab): New.
> > > > * optabs.c (expand_copysign): New parameter.
> > > > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > > * fortran/mathbuiltins.def (XORSIGN): New.
> > > >
> > > > gcc/testsuite/
> > > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > > >
> > > > * gcc.dg/tree-ssa/xorsign.c: New.
> > > > * gcc.dg/xorsign_exec.c: New.
> > > > * gcc.dg/vec-xorsign_exec.c: New.
> > > > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Tamar Christina
> 
> Why's this now done during RTL expansion rather than during late GIMPLE,
> using match.pd and an internal function for xorsign?
> 

Mainly because of Andrew's email on the 10th which stated:

> But you should get the general idea.  I would like to see more of these 
> special expand patterns really.

And there were no objections so I figured this was also an acceptable solution.

> Thanks,
> Richard.
> 
> >
> > gcc/
> > 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
> > Andrew Pinski <pins...@gmail.com>
> >
> > PR middle-end/19706
> > * expr.c (is_copysign_call_with_1): New.
> > (maybe_expand_mult_copysign): Likewise.
> > (expand_expr_real_2): Expand copysign.
> > * optabs.def (xorsign_optab): New.
> >
> > 
> > From: Andrew Pinski <pins...@gmail.com>
> > Sent: Monday, July 10, 2017 12:21:29 AM
> > To: Tamar Christina
> > Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
> > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > [Patch (1/2)]
> >
> > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > <tamar.christ...@arm.com> wrote:
> > > Hi All,
> > >
> > > > this patch implements an optimization rewriting
> > > >
> > > > x * copysign (1.0, y) and
> > > > x * copysign (-1.0, y)
> > > >
> > > > to:
> > > >
> > > > x ^ (y & (1 << sign_bit_position))
> > > >
> > > > This is done by creating a special builtin during matching and
> > > > generating the appropriate instructions during expand. This new
> > > > builtin is called XORSIGN.
> > > >
> > > > The expansion of xorsign depends on whether the backend has an
> > > > appropriate optab available. If this is not the case then we use a
> > > > modified version of the existing copysign, which does not take the
> > > > abs value of the first argument, as a fallback.
> > >
> > > This patch is a revival of a previous patch
> > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > >
> > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > > issues.
> > > Regression done on aarch64-none-linux-gnu and no regressions.
> >
> >
> > Note this is also PR 19706.
> >
> > Thanks,
> > Andrew
> >
> > >
> > > Ok for trunk?
> > >
> > > gcc/
> > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > >
> > > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > (copysigns @0 (negate @1)): Likewise.
> > > > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > > > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > > > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > (BUILT_IN_COPYSIGN): Updated function call.
> > > * optabs.h (expand_copysign): New bool.
> > > (expand_xorsign): New.
> > > * optabs.def (xorsign_optab): New.
> > > * optabs.c (expand_copysign): New parameter.
> > > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > * fortran/mathbuiltins.def (XORSIGN): New.
> > >
> > > gcc/testsuite/
> > > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> > >
> > > * gcc.dg/tree-ssa/xorsign.c: New.
> > > * gcc.dg/xorsign_exec.c: New.
> > > * gcc.dg/vec-xorsign_exec.c: New.
> > > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Richard Biener
On Mon, 10 Jul 2017, Tamar Christina wrote:

> Hi All,
> 
> I've re-spun the patch with the changes requested.
> 
> 
> This is only done when not honoring signaling NaNs.
> This transformation is done at expand time by using
> a new optab "xorsign". If the optab is not available
> then copysign is expanded as normal.
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (HONOR_SNANS (type))
+    return NULL_RTX;
+
+  if (TREE_CODE (treeop0) == SSA_NAME && TREE_CODE (treeop1) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);

You can't look up arbitrary def stmts during RTL expansion; you
have to go through get_gimple_for_ssa_name, which may return NULL
if SSA name coalescing makes doing so unsafe.

Why's this now done during RTL expansion rather than during late
GIMPLE, using match.pd and an internal function for xorsign?

Thanks,
Richard.

> 
> gcc/
> 2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
>   Andrew Pinski <pins...@gmail.com>
> 
>   PR middle-end/19706
>   * expr.c (is_copysign_call_with_1): New.
>   (maybe_expand_mult_copysign): Likewise.
>   (expand_expr_real_2): Expand copysign.
>   * optabs.def (xorsign_optab): New.
> 
> 
> From: Andrew Pinski <pins...@gmail.com>
> Sent: Monday, July 10, 2017 12:21:29 AM
> To: Tamar Christina
> Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
> Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch 
> (1/2)]
> 
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <tamar.christ...@arm.com> wrote:
> > Hi All,
> >
> > this patch implements an optimization rewriting
> >
> > x * copysign (1.0, y) and
> > x * copysign (-1.0, y)
> >
> > to:
> >
> > x ^ (y & (1 << sign_bit_position))
> >
> > This is done by creating a special builtin during matching and generating
> > the appropriate instructions during expand. This new builtin is called
> > XORSIGN.
> >
> > The expansion of xorsign depends on whether the backend has an appropriate
> > optab available. If this is not the case then we use a modified version of
> > the existing copysign, which does not take the abs value of the first
> > argument, as a fallback.
> >
> > This patch is a revival of a previous patch
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> >
> > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> > Regression done on aarch64-none-linux-gnu and no regressions.
> 
> 
> Note this is also PR 19706.
> 
> Thanks,
> Andrew
> 
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> >
> > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > (copysigns @0 (negate @1)): Likewise.
> > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > (BUILT_IN_COPYSIGN): Updated function call.
> > * optabs.h (expand_copysign): New bool.
> > (expand_xorsign): New.
> > * optabs.def (xorsign_optab): New.
> > * optabs.c (expand_copysign): New parameter.
> > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > * fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
> >
> > * gcc.dg/tree-ssa/xorsign.c: New.
> > * gcc.dg/xorsign_exec.c: New.
> > * gcc.dg/vec-xorsign_exec.c: New.
> > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-18 Thread Tamar Christina
Ping.

From: Tamar Christina
Sent: Monday, July 10, 2017 4:47 PM
To: Andrew Pinski
Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

Hi All,

I've re-spun the patch with the changes requested.


This is only done when not honoring signaling NaNs.
This transformation is done at expand time by using
a new optab "xorsign". If the optab is not available
then copysign is expanded as normal.

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression done on aarch64-none-linux-gnu and no regressions.

Ok for trunk?

gcc/
2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
Andrew Pinski <pins...@gmail.com>

PR middle-end/19706
* expr.c (is_copysign_call_with_1): New.
(maybe_expand_mult_copysign): Likewise.
(expand_expr_real_2): Expand copysign.
* optabs.def (xorsign_optab): New.


From: Andrew Pinski <pins...@gmail.com>
Sent: Monday, July 10, 2017 12:21:29 AM
To: Tamar Christina
Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<tamar.christ...@arm.com> wrote:
> Hi All,
>
> this patch implements an optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generating
> the appropriate instructions during expand. This new builtin is called
> XORSIGN.
>
> The expansion of xorsign depends on whether the backend has an appropriate
> optab available. If this is not the case then we use a modified version of
> the existing copysign, which does not take the abs value of the first
> argument, as a fallback.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>
> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> (copysigns @0 (negate @1)): Likewise.
> * builtins.c (expand_builtin_copysign): Promoted local to argument.
> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> CASE_FLT_FN (BUILT_IN_XORSIGN).
> (BUILT_IN_COPYSIGN): Updated function call.
> * optabs.h (expand_copysign): New bool.
> (expand_xorsign): New.
> * optabs.def (xorsign_optab): New.
> * optabs.c (expand_copysign): New parameter.
> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>
> * gcc.dg/tree-ssa/xorsign.c: New.
> * gcc.dg/xorsign_exec.c: New.
> * gcc.dg/vec-xorsign_exec.c: New.
> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-10 Thread Tamar Christina
Hi All,

I've re-spun the patch with the changes requested.


This is only done when not honoring signaling NaNs.
This transformation is done at expand time by using
a new optab "xorsign". If the optab is not available
then copysign is expanded as normal.
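
For readers following along, here is a minimal standalone sketch (not part of
the patch) of the identity the transformation relies on, assuming IEEE 754
binary64 doubles.  The helper names xorsign_bits, mult_copysign and same_bits
are made up for illustration.  It also hints at why the rewrite is guarded on
signaling NaNs: the multiplication is an arithmetic operation that would quiet
an sNaN, whereas the XOR only touches the sign bit.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the patch: flip the sign of X when Y is
   negative by XORing Y's sign bit into X.  Assumes IEEE 754 binary64,
   so sign_bit_position is 63.  */
static double
xorsign_bits (double x, double y)
{
  const uint64_t sign_mask = UINT64_C (1) << 63;
  uint64_t xb, yb;

  assert (sizeof (double) == sizeof (uint64_t));

  /* memcpy is the portable way to reinterpret the object representation.  */
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & sign_mask;
  memcpy (&x, &xb, sizeof x);
  return x;
}

/* The source-level form the optimization is meant to replace.  */
static double
mult_copysign (double x, double y)
{
  return x * copysign (1.0, y);
}

/* Compare bit patterns so that -0.0 and +0.0 are told apart.  */
static int
same_bits (double a, double b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}
```

With these definitions, xorsign_bits (3.5, -2.0) and mult_copysign (3.5, -2.0)
both give -3.5.  Since copysign ignores the sign of its first argument, the
copysign (-1.0, y) form behaves the same way.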

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression done on aarch64-none-linux-gnu and no regressions.

Ok for trunk?

gcc/
2017-07-10  Tamar Christina  <tamar.christ...@arm.com>
Andrew Pinski <pins...@gmail.com>

PR middle-end/19706
* expr.c (is_copysign_call_with_1): New.
(maybe_expand_mult_copysign): Likewise.
(expand_expr_real_2): Expand copysign.
* optabs.def (xorsign_optab): New.


From: Andrew Pinski <pins...@gmail.com>
Sent: Monday, July 10, 2017 12:21:29 AM
To: Tamar Christina
Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<tamar.christ...@arm.com> wrote:
> Hi All,
>
> this patch implements an optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generating
> the appropriate instructions during expand. This new builtin is called
> XORSIGN.
>
> The expansion of xorsign depends on whether the backend has an appropriate
> optab available. If this is not the case then we use a modified version of
> the existing copysign, which does not take the abs value of the first
> argument, as a fallback.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>
> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> (copysigns @0 (negate @1)): Likewise.
> * builtins.c (expand_builtin_copysign): Promoted local to argument.
> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> CASE_FLT_FN (BUILT_IN_XORSIGN).
> (BUILT_IN_COPYSIGN): Updated function call.
> * optabs.h (expand_copysign): New bool.
> (expand_xorsign): New.
> * optabs.def (xorsign_optab): New.
> * optabs.c (expand_copysign): New parameter.
> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>
> * gcc.dg/tree-ssa/xorsign.c: New.
> * gcc.dg/xorsign_exec.c: New.
> * gcc.dg/vec-xorsign_exec.c: New.
> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
diff --git a/gcc/expr.c b/gcc/expr.c
index 5febf07929d0add0ad0ae1356baef008524f0c7c..0193231bc857bdea3e02b2845e62883e1b5c291b 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8182,6 +8182,86 @@ expand_cond_expr_using_cmove (tree treeop0 ATTRIBUTE_UNUSED,
   return NULL_RTX;
 }
 
+/* Check to see if the CALL statement is an invocation of copysign
+   with 1. being the first argument.  */
+static bool
+is_copysign_call_with_1 (gimple *call)
+{
+  if (!is_gimple_call (call))
+    return false;
+
+  enum combined_fn code = gimple_call_combined_fn (call);
+
+  if (code == CFN_LAST)
+    return false;
+
+  gcall *c = as_a<gcall*> (call);
+
+  if (builtin_fn_p (code))
+    {
+      switch (as_builtin_fn (code))
+	{
+	CASE_FLT_FN (BUILT_IN_COPYSIGN):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
+	  return real_onep (gimple_call_arg (c, 0));
+	default:
+	  return false;
+	}
+    }
+
+  if (internal_fn_p (code))
+    {
+      switch (as_internal_fn (code))
+	{
+	case IFN_COPYSIGN:
+	  return real_onep (gimple_call_arg (c, 0));
+	default:
+	  return false;
+	}
+    }
+
+  return false;
+}
+
+/* Try to expand the pattern x * copysign (1, y) into xorsign (x, y).
+   This only happens when the xorsign optab is defined.  If the
+   pattern is not a xorsign pattern or if expansion fails, NULL_RTX is
+   returned; otherwise the RTX from the optab expansion is returned.  */
+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (HONOR_SNANS (type))
+    return

Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-09 Thread Andrew Pinski
On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
 wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the 
> existing
> copysign which does not take the abs value of the first argument as a fall 
> back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  
>
> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> (copysigns @0 (negate @1)): Likewise.
> * builtins.c (expand_builtin_copysign): Promoted local to argument.
> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> CASE_FLT_FN (BUILT_IN_XORSIGN).
> (BUILT_IN_COPYSIGN): Updated function call.
> * optabs.h (expand_copysign): New bool.
> (expand_xorsign): New.
> * optabs.def (xorsign_optab): New.
> * optabs.c (expand_copysign): New parameter.
> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  
>
> * gcc.dg/tree-ssa/xorsign.c: New.
> * gcc.dg/xorsign_exec.c: New.
> * gcc.dg/vec-xorsign_exec.c: New.
> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-26 Thread Tamar Christina
Hi Andrew,

Thanks! I'll put together the rest today or tomorrow.
Sorry for the slow response on this one.

Tamar

From: Andrew Pinski <pins...@gmail.com>
Sent: Monday, June 26, 2017 3:09:54 AM
To: Tamar Christina
Cc: GCC Patches; nd; l...@redhat.com; i...@airs.com; rguent...@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski <pins...@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <tamar.christ...@arm.com> wrote:
>> Hi All,
>>
>> this patch implements a optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>
>
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)
>
> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

I updated the patch to handle all constants and not just -1.0.

>
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.

Also something like attached (NOTE this is NOT a full patch and needs
the xorsign optabs part of your patch) should work for the expand side
rather than creating a new builtin.  There still needs to be handling of
the vector-based copysign.  But you should get the general idea.  I
would like to see more of these special expand patterns really.

NOTE you can remove the target hook part and just check if xorsign
optab is there.  I don't know if that is what we want to do if not
allow for generic expanding of this.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
>>
>> to:
>>
>> x ^ (y & (1 << sign_bit_position))
>>
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>>
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of the 
>> existing
>> copysign which does not take the abs value of the first argument as a fall 
>> back.
>>
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>>
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>>
>> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>> (copysigns @0 (negate @1)): Likewise.
>> * builtins.c (expand_builtin_copysign): Promoted local to argument.
>> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>> CASE_FLT_FN (BUILT_IN_XORSIGN).
>> (BUILT_IN_COPYSIGN): Updated function call.
>> * optabs.h (expand_copysign): New bool.
>> (expand_xorsign): New.
>> * optabs.def (xorsign_optab): New.
>> * optabs.c (expand_copysign): New parameter.
>> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>> * fortran/mathbuiltins.def (XORSIGN): New.
>>
>> gcc/testsuite/
>> 2017-06-07  Tamar Christina  <tamar.christ...@arm.com>
>>
>> * gcc.dg/tree-ssa/xorsign.c: New.
>> * gcc.dg/xorsign_exec.c: New.
>> * gcc.dg/vec-xorsign_exec.c: New.
>> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-26 Thread Richard Biener
On Sat, 24 Jun 2017, Andrew Pinski wrote:

> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
>  wrote:
> > Hi All,
> >
> > this patch implements a optimization rewriting
> >
> > x * copysign (1.0, y) and
> > x * copysign (-1.0, y)
> 
> 
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)

I think I suggested that in my earlier review.

> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html
> 
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.
> 
> Thanks,
> Andrew
> 
> >
> > to:
> >
> > x ^ (y & (1 << sign_bit_position))
> >
> > This is done by creating a special builtin during matching and generate the
> > appropriate instructions during expand. This new builtin is called XORSIGN.
> >
> > The expansion of xorsign depends on if the backend has an appropriate optab
> > available. If this is not the case then we use a modified version of the 
> > existing
> > copysign which does not take the abs value of the first argument as a fall 
> > back.
> >
> > This patch is a revival of a previous patch
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> >
> > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> > Regression done on aarch64-none-linux-gnu and no regressions.
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-06-07  Tamar Christina  
> >
> > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > (copysigns @0 (negate @1)): Likewise.
> > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > (BUILT_IN_COPYSIGN): Updated function call.
> > * optabs.h (expand_copysign): New bool.
> > (expand_xorsign): New.
> > * optabs.def (xorsign_optab): New.
> > * optabs.c (expand_copysign): New parameter.
> > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > * fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  
> >
> > * gcc.dg/tree-ssa/xorsign.c: New.
> > * gcc.dg/xorsign_exec.c: New.
> > * gcc.dg/vec-xorsign_exec.c: New.
> > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-25 Thread Andrew Pinski
On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski  wrote:
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
>  wrote:
>> Hi All,
>>
>> this patch implements a optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>
>
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)
>
> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

I updated the patch to handle all constants and not just -1.0.

>
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.

Also something like attached (NOTE this is NOT a full patch and needs
the xorsign optabs part of your patch) should work for the expand side
rather than creating a new builtin.  There still needs to be handling of
the vector-based copysign.  But you should get the general idea.  I
would like to see more of these special expand patterns really.

NOTE you can remove the target hook part and just check if xorsign
optab is there.  I don't know if that is what we want to do if not
allow for generic expanding of this.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
>>
>> to:
>>
>> x ^ (y & (1 << sign_bit_position))
>>
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>>
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of the 
>> existing
>> copysign which does not take the abs value of the first argument as a fall 
>> back.
>>
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>>
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2017-06-07  Tamar Christina  
>>
>> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>> (copysigns @0 (negate @1)): Likewise.
>> * builtins.c (expand_builtin_copysign): Promoted local to argument.
>> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>> CASE_FLT_FN (BUILT_IN_XORSIGN).
>> (BUILT_IN_COPYSIGN): Updated function call.
>> * optabs.h (expand_copysign): New bool.
>> (expand_xorsign): New.
>> * optabs.def (xorsign_optab): New.
>> * optabs.c (expand_copysign): New parameter.
>> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>> * fortran/mathbuiltins.def (XORSIGN): New.
>>
>> gcc/testsuite/
>> 2017-06-07  Tamar Christina  
>>
>> * gcc.dg/tree-ssa/xorsign.c: New.
>> * gcc.dg/xorsign_exec.c: New.
>> * gcc.dg/vec-xorsign_exec.c: New.
>> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
Index: gcc/expr.c
===
--- gcc/expr.c  (revision 249619)
+++ gcc/expr.c  (working copy)
@@ -8182,6 +8182,59 @@
   return NULL_RTX;
 }
 
+static bool
+is_copysign_call_with_1 (gimple *call)
+{
+  if (!is_gimple_call (call))
+    return false;
+
+  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+    {
+      gcall *c = as_a <gcall *> (call);
+      tree decl = gimple_call_fndecl (call);
+      switch (DECL_FUNCTION_CODE (decl))
+        {
+        CASE_FLT_FN (BUILT_IN_COPYSIGN):
+        CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
+          return real_onep (gimple_call_arg (c, 0));
+        default:
+          return false;
+        }
+    }
+
+  return false;
+}
+
+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (!SCALAR_FLOAT_TYPE_P (type)
+      && VECTOR_FLOAT_TYPE_P (type))
+    return NULL;
+
+  if (HONOR_SNANS (type))
+    return NULL;
+
+  if (!targetm.expand_mult_copysign_xor ())
+    return NULL;
+
+  if (TREE_CODE (treeop0) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);
+      if (is_copysign_call_with_1 (call0))
+        {
+          gcall *c = as_a <gcall *> (call0);
+          treeop0 = gimple_call_arg (c, 1);
+          expand_operands (treeop1, treeop0, NULL_RTX, &op0, &op1,
+                           EXPAND_NORMAL);
+          return expand_copysign (op0, op1, target, true);
+        }
+    }
+
+  return NULL;
+}
+
+
 rtx
 expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
enum expand_modifier modifier)
@@ -8791,6 +8844,10 @@
   if (modifier == EXPAND_STACK_PARM)
target = 0;
 
+  temp = 

Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-24 Thread Andrew Pinski
On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
 wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)


This reminds me:
copysign(-1.0, y) can be just optimized to:
copysign(1.0, y)

I did that in my patch here:
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

This should allow you to reduce the number of patterns needed to match here.
Note I still think we could do this in expand without a new
builtin/internal function.
I might go and code that up soonish.

Thanks,
Andrew

>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the 
> existing
> copysign which does not take the abs value of the first argument as a fall 
> back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  
>
> * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> (copysigns @0 (negate @1)): Likewise.
> * builtins.c (expand_builtin_copysign): Promoted local to argument.
> (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> CASE_FLT_FN (BUILT_IN_XORSIGN).
> (BUILT_IN_COPYSIGN): Updated function call.
> * optabs.h (expand_copysign): New bool.
> (expand_xorsign): New.
> * optabs.def (xorsign_optab): New.
> * optabs.c (expand_copysign): New parameter.
> * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  
>
> * gcc.dg/tree-ssa/xorsign.c: New.
> * gcc.dg/xorsign_exec.c: New.
> * gcc.dg/vec-xorsign_exec.c: New.
> * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.


RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-13 Thread Tamar Christina
Hi Richard,

> > First, nowadays please add an internal function instead of builtins.
> > You can even take advantage of Richards work to directly tie those to
> > optabs (he might want to chime in to tell you how).  You don't need
> > the fortran FE changes in that case.
> 
> Yeah, it should just be a case of adding:
> 
> DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
> 
> to internal-fn.def.  The supposedly useful thing about this is that it
> automatically extends to vectors, so you shouldn't need the xorsign vector
> builtins or the aarch64_builtin_vectorized_function change.

Ah, ok, thanks! I'll change it to an internal function.
And take a look at the testcases for the updated patch. 

> However, we don't yet support SLP vectorisation of internal functions.
> I have a patch for that that I've been looking for an excuse to post (at the
> moment I think it only helps SVE).  If this goes in I can post it as a 
> follow-on.
> 
> In:
> 
> > diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > new file mode 100644
> > index ..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> > +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
> > +
> > +extern void abort ();
> > +
> > +#define N 16
> > +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> > + -12.5f, -15.6f, -18.7f, -21.8f,
> > + 24.9f, 27.1f, 30.2f, 33.3f,
> > + 36.4f, 39.5f, 42.6f, 45.7f};
> > +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> > + -9.0f, 1.0f, -2.0f, 3.0f,
> > + -4.0f, -5.0f, 6.0f, 7.0f,
> > + -8.0f, -9.0f, 10.0f, 11.0f};
> > +float r[N];
> > +
> > +float ad[N] = {-0.1fd,  -3.2d,  -6.3d,  -9.4d,
> > +   -12.5d, -15.6d, -18.7d, -21.8d,
> > +24.9d,  27.1d,  30.2d,  33.3d,
> > +36.4d,  39.5d,  42.6d, 45.7d};
> > +float bd[N] = {-1.2d,  3.4d, -5.6d,  7.8d,
> > +   -9.0d,  1.0d, -2.0d,  3.0d,
> > +   -4.0d, -5.0d,  6.0d,  7.0d,
> > +   -8.0d, -9.0d, 10.0d, 11.0d};
> > +float rd[N];
> 
> Looks like these last three were meant to be doubles.
> 
> > +
> > +int
> > +main (void)
> > +{
> > +  int i;
> > +
> > +  for (i = 0; i < N; i++)
> > +r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> > +  abort ();
> > +
> > +  for (i = 0; i < N; i++)
> > +rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> > +  abort ();
> > +
> > +
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> 
> Why does only one loop get vectorised?
> 
> Thanks,
> Richard


RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-13 Thread Tamar Christina
Hi Richard,

Thanks for the feedback, I'll update the patch accordingly.

> What does
> 
> >   (copysigns @0 (negate @1)): Likewise.
> 
> do?
> 

Sorry, this slipped through my clean-up. The patch no longer contains this
definition.

> Third, new IL that is present throughout the compilation always poses the
> risk that while passes may be able to handle copysign they do not handle
> xorsign (vectorization?).  In this case it looks like the matching is simply 
> to
> enhance RTL expansion which means it should ideally be done close to RTL
> expansion only.  If you write
> 
> (match (xorsign_p @0 @1)
>  (mult:c (copysign real_onep @0) @1))
> 
> you can call gimple_xorsign_p (you need to declare it, see the generated
> gimple-match.c file for the definition) from, say,
> pass_optimize_widening_mul, which despite its name is used as a kitchen-
> sink for late GIMPLE pattern-matching stuff to enhance RTL expansion /
> instruction selection.
> 
> Thanks,
> Richard.
> 
> >
> > gcc/
> > 2017-06-07  Tamar Christina  
> >
> > * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > (copysigns @0 (negate @1)): Likewise.
> > * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> > CASE_FLT_FN (BUILT_IN_XORSIGN).
> > (BUILT_IN_COPYSIGN): Updated function call.
> > * optabs.h (expand_copysign): New bool.
> > (expand_xorsign): New.
> > * optabs.def (xorsign_optab): New.
> > * optabs.c (expand_copysign): New parameter.
> > * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > * fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  
> >
> > * gcc.dg/tree-ssa/xorsign.c: New.
> > * gcc.dg/xorsign_exec.c: New.
> > * gcc.dg/vec-xorsign_exec.c: New.
> > * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)


RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-13 Thread Tamar Christina


> 
> Please only enable this if you have XORSIGN and XORSIGNF.
> 
> On the PowerPC this would involve moving the value from the
> vector/floating point registers to the general purpose registers to do the XOR
> operation and then back to the vector/floating point registers.
> 

Fair enough. I think that with Richard's earlier change request this should be
fairly simple.
I'll update the patch.

Thanks

> 
> --
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-12 Thread Michael Meissner
On Mon, Jun 12, 2017 at 07:56:54AM +, Tamar Christina wrote:
> Hi All,
> 
> this patch implements a optimization rewriting
> 
> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))
> 
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
> 
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the 
> existing
> copysign which does not take the abs value of the first argument as a fall 
> back.
> 
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

Please only enable this if you have XORSIGN and XORSIGNF.

On the PowerPC this would involve moving the value from the vector/floating
point registers to the general purpose registers to do the XOR operation and
then back to the vector/floating point registers.

Note, the PowerPC has an instruction that does copysign directly.  It would be
better just to do the copysign/multiply.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-12 Thread Joseph Myers
On Mon, 12 Jun 2017, Tamar Christina wrote:

> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))

Note that this needs to be disabled for -fsignaling-nans, as if x is a 
signaling NaN, the multiplication converts it to a quiet NaN and raises 
"invalid".

> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.

If the built-in function has a user-visible name such as __builtin_xorsign 
(as opposed to one that's not a C identifier), it needs to be documented 
as a user-visible feature.  I'd suggest not having such a user-visible 
built-in function.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-12 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, 12 Jun 2017, Tamar Christina wrote:
>> Hi All,
>> 
>> this patch implements a optimization rewriting
>> 
>> x * copysign (1.0, y) and 
>> x * copysign (-1.0, y) 
>> 
>> to:
>> 
>> x ^ (y & (1 << sign_bit_position))
>> 
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>> 
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of
>> the existing
>> copysign which does not take the abs value of the first argument as a
>> fall back.
>> 
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>> 
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>> 
>> Ok for trunk?
>
> Without looking at the patch a few comments.
>
> First, nowadays please add an internal function instead of builtins.
> You can even take advantage of Richards work to directly tie those
> to optabs (he might want to chime in to tell you how).  You don't need
> the fortran FE changes in that case.

Yeah, it should just be a case of adding:

DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)

to internal-fn.def.  The supposedly useful thing about this is that it
automatically extends to vectors, so you shouldn't need the xorsign
vector builtins or the aarch64_builtin_vectorized_function change.

However, we don't yet support SLP vectorisation of internal functions.
I have a patch for that that I've been looking for an excuse to post
(at the moment I think it only helps SVE).  If this goes in I can
post it as a follow-on.

In:

> diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> new file mode 100644
> index ..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> @@ -0,0 +1,53 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
> +
> +extern void abort ();
> +
> +#define N 16
> +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> +   -12.5f, -15.6f, -18.7f, -21.8f,
> +   24.9f, 27.1f, 30.2f, 33.3f,
> +   36.4f, 39.5f, 42.6f, 45.7f};
> +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> +   -9.0f, 1.0f, -2.0f, 3.0f,
> +   -4.0f, -5.0f, 6.0f, 7.0f,
> +   -8.0f, -9.0f, 10.0f, 11.0f};
> +float r[N];
> +
> +float ad[N] = {-0.1fd,  -3.2d,  -6.3d,  -9.4d,
> +   -12.5d, -15.6d, -18.7d, -21.8d,
> +24.9d,  27.1d,  30.2d,  33.3d,
> +36.4d,  39.5d,  42.6d, 45.7d};
> +float bd[N] = {-1.2d,  3.4d, -5.6d,  7.8d,
> +   -9.0d,  1.0d, -2.0d,  3.0d,
> +   -4.0d, -5.0d,  6.0d,  7.0d,
> +   -8.0d, -9.0d, 10.0d, 11.0d};
> +float rd[N];

Looks like these last three were meant to be doubles.

> +
> +int
> +main (void)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
> +
> +  /* check results:  */
> +  for (i = 0; i < N; i++)
> +if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> +  abort ();
> +
> +  for (i = 0; i < N; i++)
> +rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
> +
> +  /* check results:  */
> +  for (i = 0; i < N; i++)
> +if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> +  abort ();
> +
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

Why does only one loop get vectorised?
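
For illustration only, a double-precision variant of the check discussed
above might look roughly like the sketch below.  It is not taken from the
patch: the array contents are shortened, check_double_xorsign is a
hypothetical name, and it assumes the plain __builtin_copysign builtin is
the intended double version of the call.

```c
#include <assert.h>

#define N 4

/* Shortened, hypothetical test data; declared as double this time.  */
static double ad[N] = { -0.1, -3.2, 24.9, 36.4 };
static double bd[N] = { -1.2, 3.4, -4.0, 11.0 };
static double rd[N];

/* Compute ad[i] * copysign (1.0, bd[i]) and compare each element with an
   independently computed expectation, storing into and checking rd.  */
static void
check_double_xorsign (void)
{
  int i;

  for (i = 0; i < N; i++)
    rd[i] = ad[i] * __builtin_copysign (1.0, bd[i]);

  /* Multiplying by +/-1.0 is exact, so an exact comparison is fine.  */
  for (i = 0; i < N; i++)
    assert (rd[i] == (bd[i] < 0 ? -ad[i] : ad[i]));
}
```

Note that the second loop checks rd rather than r, so the double results
are the ones being verified.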

Thanks,
Richard


Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-12 Thread Richard Biener
On Mon, 12 Jun 2017, Tamar Christina wrote:

> Hi All,
> 
> this patch implements a optimization rewriting
> 
> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))
> 
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
> 
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the 
> existing
> copysign which does not take the abs value of the first argument as a fall 
> back.
> 
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

Without looking at the patch a few comments.

First, nowadays please add an internal function instead of builtins.
You can even take advantage of Richard's work to directly tie those
to optabs (he might want to chime in to tell you how).  You don't need
the fortran FE changes in that case.

Second, I think we should canonicalize copysign (-CST, x) to
copysign (CST, x), thus positive real constants.  You should be
able to use

(simplify
 (copysign negate_expr_p@0 @1)
 (copysign @0 @1))

to catch both the (negate @0) and REAL_CST case.
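
The canonicalization is safe because copysign only uses the magnitude of
its first argument.  A small C check of that property, as a sketch
(check_canonical_copysign and same_value_and_sign are hypothetical names,
not part of any patch):

```c
#include <assert.h>
#include <math.h>

/* Return nonzero if A and B compare equal and have the same sign bit
   (the sign test also distinguishes +0.0 from -0.0).  */
static int
same_value_and_sign (double a, double b)
{
  return a == b && !signbit (a) == !signbit (b);
}

/* copysign (-c, y) and copysign (c, y) give the same result, because the
   sign of the first argument is discarded.  */
static void
check_canonical_copysign (double c, double y)
{
  assert (same_value_and_sign (copysign (-c, y), copysign (c, y)));
}
```

For example, check_canonical_copysign (1.0, -2.0) compares
copysign (-1.0, -2.0) with copysign (1.0, -2.0); both are -1.0.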

What does

>   (copysigns @0 (negate @1)): Likewise.

do?

Third, new IL that is present throughout the compilation always
poses the risk that while passes may be able to handle copysign
they do not handle xorsign (vectorization?).  In this case it
looks like the matching is simply to enhance RTL expansion
which means it should ideally be done close to RTL expansion
only.  If you write

(match (xorsign_p @0 @1)
 (mult:c (copysign real_onep @0) @1))

you can call gimple_xorsign_p (you need to declare it, see
the generated gimple-match.c file for the definition) from,
say, pass_optimize_widening_mul, which despite its name
is used as a kitchen-sink for late GIMPLE pattern-matching
stuff to enhance RTL expansion / instruction selection.

Thanks,
Richard.

> 
> gcc/
> 2017-06-07  Tamar Christina  
> 
>   * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>   (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>   * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>   (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>   (copysigns @0 (negate @1)): Likewise.
>   * builtins.c (expand_builtin_copysign): Promoted local to argument.
>   (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>   CASE_FLT_FN (BUILT_IN_XORSIGN).
>   (BUILT_IN_COPYSIGN): Updated function call.
>   * optabs.h (expand_copysign): New bool.
>   (expand_xorsign): New.
>   * optabs.def (xorsign_optab): New.
>   * optabs.c (expand_copysign): New parameter.
>   * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>   * fortran/mathbuiltins.def (XORSIGN): New.
> 
> gcc/testsuite/
> 2017-06-07  Tamar Christina  
> 
>   * gcc.dg/tree-ssa/xorsign.c: New.
>   * gcc.dg/xorsign_exec.c: New.
>   * gcc.dg/vec-xorsign_exec.c: New.
>   * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-12 Thread Tamar Christina
Hi All,

This patch implements an optimization rewriting

x * copysign (1.0, y) and 
x * copysign (-1.0, y) 

to:

x ^ (y & (1 << sign_bit_position))

This is done by creating a special builtin during matching and generating the
appropriate instructions during expand. This new builtin is called XORSIGN.

The expansion of xorsign depends on whether the backend has an appropriate optab
available. If it does not, we fall back to a modified version of the existing
copysign expansion that does not take the absolute value of the first argument.
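
As a rough illustration of the rewrite described above (not code from the
patch), the following C sketch applies the sign-bit XOR to a double and
compares it with the multiplication it replaces.  It assumes IEEE binary64
doubles with the sign in bit 63; xorsign_sketch and xorsign_matches_mult
are hypothetical helper names.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: flip the sign of X when the sign bit of Y is set,
   i.e. x ^ (y & (1 << sign_bit_position)) on the bit patterns.  */
static double
xorsign_sketch (double x, double y)
{
  uint64_t xb, yb;
  const uint64_t sign_mask = UINT64_C (1) << 63;

  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & sign_mask;
  memcpy (&x, &xb, sizeof x);
  return x;
}

/* Compare the bit trick with x * copysign (1.0, y) for one pair of
   ordinary (non-NaN) values.  */
static int
xorsign_matches_mult (double x, double y)
{
  return xorsign_sketch (x, y) == x * copysign (1.0, y);
}

/* A few spot checks, including a negative zero as the sign source.  */
static void
xorsign_self_check (void)
{
  assert (xorsign_matches_mult (2.5, -3.0));
  assert (xorsign_matches_mult (-2.5, -0.0));
  assert (xorsign_matches_mult (7.0, 4.0));
}
```

The comparison deliberately avoids NaN inputs, where the two forms are not
interchangeable.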

This patch is a revival of a previous patch
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression tested on aarch64-none-linux-gnu with no regressions.

Ok for trunk?

gcc/
2017-06-07  Tamar Christina  

* builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
(BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
* match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
(mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
(copysigns @0 (negate @1)): Likewise.
* builtins.c (expand_builtin_copysign): Promoted local to argument.
(expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
CASE_FLT_FN (BUILT_IN_XORSIGN).
(BUILT_IN_COPYSIGN): Updated function call.
* optabs.h (expand_copysign): New bool.
(expand_xorsign): New.
* optabs.def (xorsign_optab): New.
* optabs.c (expand_copysign): New parameter.
* fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
* fortran/mathbuiltins.def (XORSIGN): New.

gcc/testsuite/
2017-06-07  Tamar Christina  

* gcc.dg/tree-ssa/xorsign.c: New.
* gcc.dg/xorsign_exec.c: New.
* gcc.dg/vec-xorsign_exec.c: New.
* gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 30462ad0f419721fd0aa2029dbc9f8f5593b5823..2a84bebf5f6235f84a0f46f15ba2fed67b1d5564 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5117,10 +5117,12 @@ expand_builtin_fabs (tree exp, rtx target, rtx subtarget)
 /* Expand EXP, a call to copysign, copysignf, or copysignl.
Return NULL is a normal call should be emitted rather than expanding the
function inline.  If convenient, the result should be placed in TARGET.
-   SUBTARGET may be used as the target for computing the operand.  */
+   SUBTARGET may be used as the target for computing the operand.
+   If OP0_NEEDS_ABS is true then abs() will be performed on the first
+   argument.  */
 
 static rtx
-expand_builtin_copysign (tree exp, rtx target, rtx subtarget)
+expand_builtin_copysign (tree exp, rtx target, rtx subtarget, bool op0_needs_abs)
 {
   rtx op0, op1;
   tree arg;
@@ -5134,7 +5136,7 @@ expand_builtin_copysign (tree exp, rtx target, rtx subtarget)
   arg = CALL_EXPR_ARG (exp, 1);
   op1 = expand_normal (arg);
 
-  return expand_copysign (op0, op1, target);
+  return expand_copysign (op0, op1, target, op0_needs_abs);
 }
 
 /* Expand a call to __builtin___clear_cache.  */
@@ -6586,7 +6588,14 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 
 CASE_FLT_FN (BUILT_IN_COPYSIGN):
 CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
-  target = expand_builtin_copysign (exp, target, subtarget);
+  target = expand_builtin_copysign (exp, target, subtarget, true);
+  if (target)
+	return target;
+  break;
+
+CASE_FLT_FN (BUILT_IN_XORSIGN):
+CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN):
+  target = expand_builtin_copysign (exp, target, subtarget, false);
   if (target)
 	return target;
   break;
@@ -7688,7 +7697,7 @@ builtin_mathfn_code (const_tree t)
   const_call_expr_arg_iterator iter;
 
   if (TREE_CODE (t) != CALL_EXPR
-  || TREE_CODE (CALL_EXPR_FN (t)) != ADDR_EXPR)
+  || (CALL_EXPR_FN (t) && TREE_CODE (CALL_EXPR_FN (t)) != ADDR_EXPR))
 return END_BUILTINS;
 
   fndecl = get_callee_fndecl (t);
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 58d78dbbdee58df77fb7bad904362327704403c5..9508fc35d622369ab5b89fc63d3add3728931279 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -325,6 +325,12 @@ DEF_C99_BUILTIN(BUILT_IN_COPYSIGNL, "copysignl", BT_FN_LONGDOUBLE_LONGDO
 #define COPYSIGN_TYPE(F) BT_FN_##F##_##F##_##F
 DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef COPYSIGN_TYPE
+DEF_GCC_BUILTIN(BUILT_IN_XORSIGN, "xorsign", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_XORSIGNF, "xorsignf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_XORSIGNL, "xorsignl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define XORSIGN_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_XORSIGN,