Re: [PATCH] Ensure noce_convert_multiple_sets handles only multiple sets (PR rtl-optimization/69570)

2016-02-01 Thread Richard Biener
On February 1, 2016 9:26:38 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Mon, Feb 01, 2016 at 09:39:19AM +0100, Steven Bosscher wrote:
>> Browny points for opting out of the loop over all insns in the basic
>> block when count > limit.
>
>Like this?
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-02-01  Jakub Jelinek  
>
>   * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Return false
>   when count is incremented above limit, don't analyze further
>   insns afterwards.
>
>--- gcc/ifcvt.c.jj 2016-02-01 09:46:00.0 +0100
>+++ gcc/ifcvt.c2016-02-01 12:33:28.932281244 +0100
>@@ -3286,15 +3286,13 @@ bb_ok_for_noce_convert_multiple_sets (ba
>   if (!can_conditionally_move_p (GET_MODE (dest)))
>   return false;
> 
>-  ++count;
>+  /* FORNOW: Our cost model is a count of the number of
>instructions we
>+   would if-convert.  This is suboptimal, and should be improved as
>part
>+   of a wider rework of branch_cost.  */
>+  if (++count > limit)
>+  return false;
> }
> 
>-  /* FORNOW: Our cost model is a count of the number of instructions
>we
>- would if-convert.  This is suboptimal, and should be improved as
>part
>- of a wider rework of branch_cost.  */
>-  if (count > limit)
>-return false;
>-
>   return count > 1;
> }
> 
>
>
>   Jakub




Re: [PATCH] Ensure noce_convert_multiple_sets handles only multiple sets (PR rtl-optimization/69570)

2016-02-01 Thread Jakub Jelinek
On Mon, Feb 01, 2016 at 09:39:19AM +0100, Steven Bosscher wrote:
> Browny points for opting out of the loop over all insns in the basic
> block when count > limit.

Like this?
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-01  Jakub Jelinek  

* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Return false
when count is incremented above limit, don't analyze further
insns afterwards.

--- gcc/ifcvt.c.jj  2016-02-01 09:46:00.0 +0100
+++ gcc/ifcvt.c 2016-02-01 12:33:28.932281244 +0100
@@ -3286,15 +3286,13 @@ bb_ok_for_noce_convert_multiple_sets (ba
   if (!can_conditionally_move_p (GET_MODE (dest)))
return false;
 
-  ++count;
+  /* FORNOW: Our cost model is a count of the number of instructions we
+would if-convert.  This is suboptimal, and should be improved as part
+of a wider rework of branch_cost.  */
+  if (++count > limit)
+   return false;
 }
 
-  /* FORNOW: Our cost model is a count of the number of instructions we
- would if-convert.  This is suboptimal, and should be improved as part
- of a wider rework of branch_cost.  */
-  if (count > limit)
-return false;
-
   return count > 1;
 }
 


Jakub


Re: [PATCH] Ensure noce_convert_multiple_sets handles only multiple sets (PR rtl-optimization/69570)

2016-02-01 Thread Steven Bosscher
On Mon, Feb 1, 2016 at 9:32 AM, Jakub Jelinek  wrote:
> Bootstrapped/regtested on 
> {x86_64,i686,ppc64,ppc64le,s390,s390x,aarch64}-linux,
> ok for trunk?

OK.

Browny points for opting out of the loop over all insns in the basic
block when count > limit.

Ciao!
Steven


[PATCH] Ensure noce_convert_multiple_sets handles only multiple sets (PR rtl-optimization/69570)

2016-02-01 Thread Jakub Jelinek
Hi!

While looking at this PR (which is most likely a reg-stack or RA bug
triggered by the ifcvt noce_convert_multiple_sets additions), I've noticed
that despite the "multiple_sets" in the name it actually attempts to handle
not just multiple sets, but also the single set case, which is already
handled by various calls later on noce_process_if_block.  Additionally,
it handles it worse than those calls we had for years, because it always
creates temporary pseudos to store result into and then at the end copy back
to the desired register.  If there is just a single set, this temporary is
unnecessary and unfortunately negatively affects RA (get larger code with
more spills/fills in *.reload/postreload).

So, I'd like to change noce_convert_multiple_sets to really apply to
multiple sets only.  While it makes the issue latent again (and I'll try to
analyze it), IMHO it is the right step forward.

Bootstrapped/regtested on {x86_64,i686,ppc64,ppc64le,s390,s390x,aarch64}-linux,
ok for trunk?

2016-02-01  Jakub Jelinek  

PR rtl-optimization/69570
* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Return true only
if there is more than one set, not if there is a single set.

* g++.dg/opt/pr69570.C: New test.

--- gcc/ifcvt.c.jj  2016-01-21 17:53:32.0 +0100
+++ gcc/ifcvt.c 2016-01-31 13:47:34.171323086 +0100
@@ -3295,7 +3295,7 @@ bb_ok_for_noce_convert_multiple_sets (ba
   if (count > limit)
 return false;
 
-  return count > 0;
+  return count > 1;
 }
 
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
--- gcc/testsuite/g++.dg/opt/pr69570.C.jj   2016-01-31 22:49:03.747216450 
+0100
+++ gcc/testsuite/g++.dg/opt/pr69570.C  2016-01-31 22:49:18.861009011 +0100
@@ -0,0 +1,70 @@
+// PR rtl-optimization/69570
+// { dg-do run }
+// { dg-options "-O2" }
+// { dg-additional-options "-fpic" { target fpic } }
+// { dg-additional-options "-march=i686" { target ia32 } }
+
+template  inline const T &
+min (const T , const T )
+{
+  if (b < a)
+return b;
+  return a;
+}
+
+template  inline const T &
+max (const T , const T )
+{
+  if (a < b)
+return b;
+  return a;
+}
+
+static inline void
+foo (unsigned x, unsigned y, unsigned z, double , double , double )
+{
+  double r = x / 255.0;
+  double g = y / 255.0;
+  double b = z / 255.0;
+  double m = max (r, max (g, b));
+  double n = min (r, min (g, b));
+  double d = m - n;
+  double e = m + n;
+  h = 0.0, s = 0.0, l = e / 2.0;
+  if (d > 0.0)
+{
+  s = l > 0.5 ? d / (2.0 - e) : d / e;
+  if (m == r && m != g)
+h = (g - b) / d + (g < b ? 6.0 : 0.0);
+  if (m == g && m != b)
+h = (b - r) / d + 2.0;
+  if (m == b && m != r)
+h = (r - g) / d + 4.0;
+  h /= 6.0;
+}
+}
+
+__attribute__ ((noinline, noclone))
+void bar (unsigned x[3], double y[3])
+{
+  double h, s, l;
+  foo (x[0], x[1], x[2], h, s, l);
+  y[0] = h;
+  y[1] = s;
+  y[2] = l;
+}
+
+int
+main ()
+{
+  unsigned x[3] = { 0, 128, 0 };
+  double y[3];
+
+  bar (x, y);
+  if (__builtin_fabs (y[0] - 0.3) > 0.001
+  || __builtin_fabs (y[1] - 1) > 0.001
+  || __builtin_fabs (y[2] - 0.25098) > 0.001)
+__builtin_abort ();
+
+  return 0;
+}

Jakub