[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-28 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

Bernd Schmidt  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Bernd Schmidt  ---
Still fixed.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-28 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #13 from Bernd Schmidt  ---
Author: bernds
Date: Mon Nov 28 08:59:01 2016
New Revision: 242908

URL: https://gcc.gnu.org/viewcvs?rev=242908=gcc=rev
Log:
PR rtl-optimization/78120
* rtlanal.c (insn_rtx_cost): Revert previous change.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtlanal.c

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-24 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #12 from Bernd Schmidt  ---
Author: bernds
Date: Thu Nov 24 12:22:16 2016
New Revision: 242834

URL: https://gcc.gnu.org/viewcvs?rev=242834=gcc=rev
Log:
PR rtl-optimization/78120
* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
cases, and additionally test against max_seq_cost for speed
optimization.
(noce_process_if_block): Compute an estimate for the original cost when
optimizing for speed, using the minimum of then and else block costs.

testsuite/
PR rtl-optimization/78120
* gcc.target/i386/pr78120.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr78120.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ifcvt.c
trunk/gcc/testsuite/ChangeLog

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-24 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #11 from Bernd Schmidt  ---
Author: bernds
Date: Thu Nov 24 12:17:52 2016
New Revision: 242833

URL: https://gcc.gnu.org/viewcvs?rev=242833=gcc=rev
Log:
PR rtl-optimization/78120
* rtlanal.c (insn_rtx_cost): Use set_rtx_cost.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtlanal.c

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-24 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #10 from Bernd Schmidt  ---
Author: bernds
Date: Thu Nov 24 12:16:47 2016
New Revision: 242832

URL: https://gcc.gnu.org/viewcvs?rev=242832=gcc=rev
Log:
PR rtl-optimization/78120
* config/i386/i386.c (ix86_rtx_costs): Fully handle SETs.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-14 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #9 from James Greenhalgh  ---
I haven't had a chance to bootstrap or benchmark it, but your patch looks
reasonable to me.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #8 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #0)

> typedef unsigned long u64;
> 
> typedef struct {
>   u64 hi, lo;
> } u128;
> 
> static inline u128 add_u128 (u128 a, u128 b)
> {
>   a.lo += b.lo;
>   if (a.lo < b.lo)
> a.hi++;
> 
>   return a;

While not connected to this problem, it has been pointed out that 

a.hi += b.hi;

is missing in the above source, if we want to implement true 128bit addition.

So, if we want to be pedantic, the missing line should be added to the test. It
doesn't change the outcome in any way.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #7 from Bernd Schmidt  ---
Sorry James, I think these two got mixed up in my memory.

I've attached a candidate patch I'm testing. This tries to make a better effort
to calculate before/after costs for the speed case so we don't rely entirely on
max_seq_cost. I'd be interested in whether it produces good results on ARM (or
even on x86) if someone is set up to run benchmarks.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #6 from Bernd Schmidt  ---
Created attachment 40028
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40028=edit
Candidate patch

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

James Greenhalgh  changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org

--- Comment #5 from James Greenhalgh  ---
If this is a 6 regression it may be more likely related to Kyrill's work on
if-converting arithmetic instructions ( r227368 and follow-ups) than my cost
work (which only landed for GCC 7) - though I'm sure my cost work won't have
helped as the impact of my changes would likely cause less if-conversion for
x86, which reports branches as cheap and conditional moves as expensive.

My ifcvt work for GCC 6 was more related to multiple sets in a basic block,
which I don't think this testcase requires.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #4 from Bernd Schmidt  ---
The other issue here seems to be simply that BRANCH_COST is 0 for predictable
branches on x86. Which makes the default implementation of the hook used here

  if_info.max_seq_cost
= targetm.max_noce_ifcvt_seq_cost (then_edge);

come out as zero, and we fail the conversion because of this.

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

--- Comment #3 from Bernd Schmidt  ---
Gah, that's obviously bogus and fails elsewhere. Still looking...

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-11-11 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

Bernd Schmidt  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||bernds at gcc dot gnu.org,
   ||jgreenhalgh at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |bernds at gcc dot 
gnu.org

--- Comment #2 from Bernd Schmidt  ---
Looks like a mismatch where we use different cost calculations to compare
before/after sequences. The following seems to make it come out right, but it
probably needs a lot of testing to make sure it behaves as intended. (James?)

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 242038)
+++ gcc/rtlanal.c   (working copy)
@@ -5224,13 +5224,8 @@ seq_cost (const rtx_insn *seq, bool spee
   rtx set;

   for (; seq; seq = NEXT_INSN (seq))
-{
-  set = single_set (seq);
-  if (set)
-cost += set_rtx_cost (set, speed);
-  else
-cost++;
-}
+if (NONDEBUG_INSN_P (seq))
+  cost += insn_rtx_cost (set, speed);

   return cost;
 }

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-10-26 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |6.3

[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed

2016-10-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120

Marc Glisse  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-10-26
 Ever confirmed|0   |1

--- Comment #1 from Marc Glisse  ---
First I thought that the widening_mul pass detected the add_overflow pattern
too late for later optimization to happen, but even if I write directly:
  if(__builtin_add_overflow(a.lo,b.lo,))
a.hi++;
nothing (even with -Ofast -ftree-loop-if-convert-stores) turns it into the
equivalent of
  a.hi+=__builtin_add_overflow(a.lo,b.lo,);
which anyway also generates a mix of branch and setc instead of using the carry
as one might hope...