On Thu, 3 Aug 2017, Tamar Christina wrote:

> Hi All,
> 
> This patch implements an optimization rewriting
> 
> x * copysign (1.0, y)
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))
> 
> 
> This is only done when not honoring signaling NaNs.
> This transformation is done at SSA mult widening time and
> is gated on a check for the optab "xorsign".
> If the optab is not available then copysign is expanded as normal.
> 
> If the optab exists then the expression is replaced with
> a call to the internal function XORSIGN.
> 
> This patch is a revival of previous patches:
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00749.html
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression tested on aarch64-none-linux-gnu with no regressions.
> 
> Ok for trunk?
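
For reference, the transform is just a sign-bit transfer.  A minimal
standalone C illustration of the semantics (assuming IEEE binary64;
this is not the patch's code):

#include <stdint.h>
#include <string.h>

static double
xorsign (double x, double y)
{
  uint64_t xb, yb;
  /* Type-pun through memcpy to stay clear of strict aliasing.  */
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  /* Flip x's sign bit iff y is negative; this equals
     x * copysign (1.0, y) when signaling NaNs need not be honored.  */
  xb ^= yb & (UINT64_C (1) << 63);
  memcpy (&x, &xb, sizeof x);
  return x;
}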

+DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
+

Please avoid adding spurious vertical space.

+is_copysign_call_with_1 (gimple *call)
+{
+  if (!is_gimple_call (call))
+    return false;
+
+  enum combined_fn code = gimple_call_combined_fn (call);
+
+  if (code == CFN_LAST)
+    return false;
+
+  gcall *c = as_a<gcall*> (call);

A better idiom would be

is_copysign_call_with_1 (gimple *call)
{
  gcall *c = dyn_cast <gcall *> (call);
  if (! c)
    return false;
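
Completed, that could look like the following sketch
(CASE_CFN_COPYSIGN, real_onep and gimple_call_arg are existing
helpers; the exact structure is of course up to you):

static bool
is_copysign_call_with_1 (gimple *call)
{
  gcall *c = dyn_cast <gcall *> (call);
  if (! c)
    return false;

  /* The combined-fn lookup covers both built-in and internal
     copysign calls; non-calls were rejected by the dyn_cast.  */
  switch (gimple_call_combined_fn (c))
    {
    CASE_CFN_COPYSIGN:
      /* Only copysign (1.0, y) feeds the xorsign rewrite.  */
      return real_onep (gimple_call_arg (c, 0));

    default:
      return false;
    }
}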

+  if (TREE_CODE (treeop0) == SSA_NAME && TREE_CODE (treeop1) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);
+      if (!is_copysign_call_with_1 (call0))
+       {
+         /* IPA sometimes inlines and then extracts the function again,
+            resulting in an incorrect order, so check both ways.  */
+         call0 = SSA_NAME_DEF_STMT (treeop1);

It can happen in either order depending on SSA name versioning, so the
comment is misleading - please drop it.

+       gcall *call_stmt
+         = gimple_build_call_internal (IFN_XORSIGN, 2, treeop1,treeop0);

Space before treeop0 is missing.

+       gimple_set_lhs (call_stmt, lhs);
+       gimple_set_location (call_stmt, loc);
+       gsi_insert_after (gsi, call_stmt, GSI_SAME_STMT);
+       return true;

please use gsi_replace (gsi, call_stmt, true) instead of inserting
after the original stmt and removing it in the caller, and ...
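
A minimal sketch of the tail of convert_expand_mult_copysign with
that change (lhs and loc as in the patch):

       gimple_set_lhs (call_stmt, lhs);
       gimple_set_location (call_stmt, loc);
       /* Replace the multiply in place; no gsi_remove needed later.  */
       gsi_replace (gsi, call_stmt, true);
       return true;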

@@ -4122,6 +4212,11 @@ pass_optimize_widening_mul::execute (function *fun)
                      release_defs (stmt);
                      continue;
                    }
+                if (convert_expand_mult_copysign (stmt, &gsi))
+                   {
+                     gsi_remove (&gsi, true);
+                     continue;
+                   }
                  break;

handle this like convert_mult_to_widen, thus:

              switch (code)
                {
                case MULT_EXPR:
                  if (!convert_mult_to_widen (stmt, &gsi)
                      && !convert_expand_mult_copysign (stmt, &gsi)
                      && convert_mult_to_fma (stmt,
                                              gimple_assign_rhs1 (stmt),
                                              gimple_assign_rhs2 (stmt)))
                    {
                      gsi_remove (&gsi, true);
                      release_defs (stmt);
                      continue;
                    }
                  break;

given that you likely do not want x * copysign (1, y) + z to be
transformed into an FMA, but still want to use xorsign?

I am also missing a single-use check on the copysign result.  If the
result of copysign is used in multiple places, do we want to keep the
multiply, or is the xorsign much cheaper anyway?  The only potential
disadvantage would be higher register pressure if y and copysign (1, y)
were not live at the same time before.
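
Such a guard could look like the following sketch, where TREEOP is a
placeholder for whichever of treeop0/treeop1 the copysign call defines
(has_single_use is the existing SSA predicate):

  /* Skip the rewrite if the copysign result has other uses; the
     multiply would stay live anyway.  */
  if (!has_single_use (TREEOP))
    return false;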

Thanks,
Richard.


> gcc/
> 2017-08-03  Tamar Christina  <tamar.christ...@arm.com>
>           Andrew Pinski <pins...@gmail.com>
> 
>       PR middle-end/19706
>       * internal-fn.def (XORSIGN): New.
>       * optabs.def (xorsign_optab): New.
>       * tree-ssa-math-opts.c (is_copysign_call_with_1): New.
>       (convert_expand_mult_copysign): New.
>       (pass_optimize_widening_mul::execute): Call
>       convert_expand_mult_copysign.
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
