[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 Bernd Schmidt changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #14 from Bernd Schmidt --- Still fixed.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #13 from Bernd Schmidt --- Author: bernds Date: Mon Nov 28 08:59:01 2016 New Revision: 242908 URL: https://gcc.gnu.org/viewcvs?rev=242908=gcc=rev Log: PR rtl-optimization/78120 * rtlanal.c (insn_rtx_cost): Revert previous change. Modified: trunk/gcc/ChangeLog trunk/gcc/rtlanal.c
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #12 from Bernd Schmidt --- Author: bernds Date: Thu Nov 24 12:22:16 2016 New Revision: 242834 URL: https://gcc.gnu.org/viewcvs?rev=242834=gcc=rev Log: PR rtl-optimization/78120 * ifcvt.c (noce_conversion_profitable_p): Check original cost in all cases, and additionally test against max_seq_cost for speed optimization. (noce_process_if_block): Compute an estimate for the original cost when optimizing for speed, using the minimum of then and else block costs. testsuite/ PR rtl-optimization/78120 * gcc.target/i386/pr78120.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr78120.c Modified: trunk/gcc/ChangeLog trunk/gcc/ifcvt.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #11 from Bernd Schmidt --- Author: bernds Date: Thu Nov 24 12:17:52 2016 New Revision: 242833 URL: https://gcc.gnu.org/viewcvs?rev=242833=gcc=rev Log: PR rtl-optimization/78120 * rtlanal.c (insn_rtx_cost): Use set_rtx_cost. Modified: trunk/gcc/ChangeLog trunk/gcc/rtlanal.c
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #10 from Bernd Schmidt --- Author: bernds Date: Thu Nov 24 12:16:47 2016 New Revision: 242832 URL: https://gcc.gnu.org/viewcvs?rev=242832=gcc=rev Log: PR rtl-optimization/78120 * config/i386/i386.c (ix86_rtx_costs): Fully handle SETs. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #9 from James Greenhalgh --- I haven't had a chance to bootstrap or benchmark it, but your patch looks reasonable to me.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #8 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #0) > typedef unsigned long u64; > > typedef struct { > u64 hi, lo; > } u128; > > static inline u128 add_u128 (u128 a, u128 b) > { > a.lo += b.lo; > if (a.lo < b.lo) > a.hi++; > > return a; While not connected to this problem, it has been pointed out that a.hi += b.hi; is missing in the above source, if we want to implement true 128bit addition. So, if we want to be pedantic, the missing line should be added to the test. It doesn't change the outcome in any way.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #7 from Bernd Schmidt --- Sorry James, I think these two got mixed up in my memory. I've attached a candidate patch I'm testing. This tries to make a better effort to calculate before/after costs for the speed case so we don't rely entirely on max_seq_cost. I'd be interested in whether it produces good results on ARM (or even on x86) if someone is set up to run benchmarks.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #6 from Bernd Schmidt --- Created attachment 40028 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40028=edit Candidate patch
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 James Greenhalgh changed: What|Removed |Added CC||ktkachov at gcc dot gnu.org --- Comment #5 from James Greenhalgh --- If this is a 6 regression it may be more likely related to Kyrill's work on if-converting arithmetic instructions ( r227368 and follow-ups) than my cost work (which only landed for GCC 7) - though I'm sure my cost work won't have helped as the impact of my changes would likely cause less if-conversion for x86, which reports branches as cheap and conditional moves as expensive. My ifcvt work for GCC 6 was more related to multiple sets in a basic block, which I don't think this testcase requires.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #4 from Bernd Schmidt --- The other issue here seems to be simply that BRANCH_COST is 0 for predictable branches on x86. Which makes the default implementation of the hook used here if_info.max_seq_cost = targetm.max_noce_ifcvt_seq_cost (then_edge); come out as zero, and we fail the conversion because of this.
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 --- Comment #3 from Bernd Schmidt --- Gah, that's obviously bogus and fails elsewhere. Still looking...
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 Bernd Schmidt changed: What|Removed |Added Status|NEW |ASSIGNED CC||bernds at gcc dot gnu.org, ||jgreenhalgh at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |bernds at gcc dot gnu.org --- Comment #2 from Bernd Schmidt --- Looks like a mismatch where we use different cost calculations to compare before/after sequences. The following seems to make it come out right, but it probably needs a lot of testing to make sure it behaves as intended. (James?) Index: gcc/rtlanal.c === --- gcc/rtlanal.c (revision 242038) +++ gcc/rtlanal.c (working copy) @@ -5224,13 +5224,8 @@ seq_cost (const rtx_insn *seq, bool spee rtx set; for (; seq; seq = NEXT_INSN (seq)) -{ - set = single_set (seq); - if (set) -cost += set_rtx_cost (set, speed); - else -cost++; -} +if (NONDEBUG_INSN_P (seq)) + cost += insn_rtx_cost (set, speed); return cost; }
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |6.3
[Bug rtl-optimization/78120] [6/7 Regression] If conversion no longer performed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78120 Marc Glisse changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2016-10-26 Ever confirmed|0 |1 --- Comment #1 from Marc Glisse --- First I thought that the widening_mul pass detected the add_overflow pattern too late for later optimization to happen, but even if I write directly: if(__builtin_add_overflow(a.lo,b.lo,)) a.hi++; nothing (even with -Ofast -ftree-loop-if-convert-stores) turns it into the equivalent of a.hi+=__builtin_add_overflow(a.lo,b.lo,); which anyway also generates a mix of branch and setc instead of using the carry as one might hope...