[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868

Segher Boessenkool changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Segher Boessenkool ---
As Andrew notes in comment #3, "addic." is not microcoded on Cell BE.
I fixed this misclassification about a year ago (it used to be type
"compare", now is "add"). Current trunk also does not do a load/store;
all is good now.
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868

Segher Boessenkool changed:

           What    |Removed                       |Added
             Status|NEW                           |ASSIGNED
                 CC|                              |segher at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |segher at gcc dot gnu.org

--- Comment #10 from Segher Boessenkool ---
We no longer generate addic. for this testcase, but that is an accident
(combine first makes dec+cmp into an addic., but then also combines it
with the conditional branch into a bdnz pattern; this needs splitting
later, and since r218591 we no longer split to addic.).

*add<mode>3_imm_{dot,dot2} should have rs6000_gen_cell_microcode in the
condition. Mine.
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |NEW
         AssignedTo|pinskia at gcc dot gnu.org  |unassigned at gcc dot
                   |                            |gnu.org

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-11-29 23:18:46 UTC ---
No longer working on this.
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-11-29 23:18:32 UTC ---
No longer working on this.
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #7 from siarhei dot siamashka at gmail dot com 2009-11-03 20:09 ---
Thanks a lot for checking this. And sorry about the confusion caused by
attributing the slowness of the testcase to the microcoded instruction
(which turned out not to be the case) without properly checking this first.

So should this bug be split into two? One about the incorrect warning, and
another one about generating nonoptimal code at the -O2 level (extra load and
store operations, which are probably penalized by something like a RAW hazard
in such a short loop)?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #2 from pinskia at gcc dot gnu dot org 2009-11-02 16:51 ---
Simple patch which I am testing right now:

Index: gcc/gcc/config/rs6000/rs6000.md
===
--- gcc/gcc/config/rs6000/rs6000.md (revision 153680)
+++ gcc/gcc/config/rs6000/rs6000.md (working copy)
@@ -1627,7 +1627,7 @@ (define_insn "*add<mode>3_internal3"
    (set_attr "length" "4,4,8,8")])

 (define_split
-  [(set (match_operand:CC 3 "cc_reg_not_cr0_operand" "")
+  [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "")
        (compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "")
                            (match_operand:P 2 "reg_or_short_operand" ""))

--

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
         AssignedTo|unassigned at gcc dot gnu   |pinskia at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-11-02 16:51:40
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #3 from pinskia at gcc dot gnu dot org 2009-11-02 16:56 ---
Actually, the warning is incorrect, at least according to the PPU book 4.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #4 from pinskia at gcc dot gnu dot org 2009-11-02 17:05 ---
In fact, changing the addic. into addic/cmpwi does not improve the speed of
the code.

With the change:
[apin...@dhcp-10-98-10-216 local]$ time ./a.out
56.316u 0.084s 0:57.09 98.7% 0+0k 0+0io 0pf+0w

Without:
56.276u 0.088s 0:57.08 98.7% 0+0k 0+0io 0pf+0w

So only the warning is invalid.

With -Os on the trunk:
24.144u 0.032s 0:24.45 98.8% 0+0k 0+0io 0pf+0w

I don't know offhand why -Os is faster than -O2.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #5 from pinskia at gcc dot gnu dot org 2009-11-02 17:08 ---
In fact doing the following diff to the -Os assembly:

--- t5.Os.s     2009-11-02 23:18:52.0 +0900
+++ t5.Os.dot.s 2009-11-02 23:20:19.0 +0900
@@ -29,9 +29,9 @@ x:
 .L4:
        bl y
 .L3:
-       cmpwi 7,31,0
-       addi 31,31,-1
-       bne 7,.L4
+#      cmpwi 7,31,0
+       addic. 31,31,-1
+       bne .L4
        addi 11,1,16
        b _restgpr_31_x
        .size x,.-x

produces the same result as -Os on the trunk.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
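
For readers comparing the two loop tails in comment #5, here is a minimal C model of what each one tests. It is only a sketch: the function names are made up for illustration, the register is modelled as a 32-bit unsigned value, and the carry bit that addic. also writes is ignored.

```c
#include <stdint.h>

/* Original -Os tail:  cmpwi 7,31,0 ; addi 31,31,-1 ; bne 7,.L4
   The compare sees the value BEFORE the decrement.
   Returns nonzero when the branch back to .L4 is taken. */
static int tail_cmp_then_addi(uint32_t *r31)
{
    int taken = (*r31 != 0);
    *r31 -= 1;
    return taken;
}

/* Edited tail:  addic. 31,31,-1 ; bne .L4
   The record form sets CR0 from the value AFTER the decrement.
   (Hypothetical model; the CA bit is not represented.) */
static int tail_addic_dot(uint32_t *r31)
{
    *r31 -= 1;
    return *r31 != 0;
}
```

Starting from the same counter, the edited tail branches back one time fewer than the original, so the hand-edited assembly is useful as a timing experiment (which is how the comment uses it) rather than as a drop-in replacement.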
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #6 from pinskia at gcc dot gnu dot org 2009-11-02 17:10 ---
So in conclusion, addic. is not microcoded and the warning is incorrect, but
-Os is still faster than -O2.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
[Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
--- Comment #1 from siarhei dot siamashka at gmail dot com 2009-10-29 15:21 ---
-O2:

0000000000000010 <.x>:
  10:   2c 23 00 00     cmpdi   r3,0
  14:   7c 08 02 a6     mflr    r0
  18:   f8 01 00 10     std     r0,16(r1)
  1c:   f8 21 ff 81     stdu    r1,-128(r1)
  20:   41 82 00 1c     beq-    3c <.x+0x2c>
  24:   f8 61 00 70     std     r3,112(r1)
  28:   48 00 00 01     bl      28 <.x+0x18>
  2c:   e8 01 00 70     ld      r0,112(r1)
  30:   35 20 ff ff     addic.  r9,r0,-1
  34:   f9 21 00 70     std     r9,112(r1)
  38:   40 82 ff f0     bne+    28 <.x+0x18>
  3c:   38 21 00 80     addi    r1,r1,128
  40:   e8 01 00 10     ld      r0,16(r1)
  44:   7c 08 03 a6     mtlr    r0
  48:   4e 80 00 20     blr
  4c:   00 00 00 00     .long 0x0
  50:   00 00 00 01     .long 0x1
  54:   80 00 00 00     lwz     r0,0(0)

-Os:

0000000000000010 <.x>:
  10:   fb e1 ff f8     std     r31,-8(r1)
  14:   7c 08 02 a6     mflr    r0
  18:   f8 01 00 10     std     r0,16(r1)
  1c:   7c 7f 1b 78     mr      r31,r3
  20:   f8 21 ff 81     stdu    r1,-128(r1)
  24:   48 00 00 08     b       2c <.x+0x1c>
  28:   48 00 00 01     bl      28 <.x+0x18>
  2c:   2f bf 00 00     cmpdi   cr7,r31,0
  30:   3b ff ff ff     addi    r31,r31,-1
  34:   40 9e ff f4     bne+    cr7,28 <.x+0x18>
  38:   38 21 00 80     addi    r1,r1,128
  3c:   e8 01 00 10     ld      r0,16(r1)
  40:   eb e1 ff f8     ld      r31,-8(r1)
  44:   7c 08 03 a6     mtlr    r0
  48:   4e 80 00 20     blr
  4c:   00 00 00 00     .long 0x0
  50:   00 00 00 01     .long 0x1
  54:   80 01 00 00     lwz     r0,0(r1)

--

siarhei dot siamashka at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |siarhei dot siamashka at
                   |                            |gmail dot com
           Keywords|                            |missed-optimization
            Summary|cell microcode instruction  |cell microcode instruction
                   |is generated for a trivial  |(addic.) is generated for a
                   |loop with -O2 optimizations,|trivial loop with -O2
                   |hurting performance badly   |optimizations, hurting
                   |                            |performance badly

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
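
The testcase source itself does not appear in this part of the thread. Judging only from the two listings above, it is probably a loop of roughly the following shape; treat this as a guess. The `long` parameter type is inferred from the 64-bit `cmpdi`, and `y()` is presumably external in the real testcase, whereas here it is a counting stub so the file is self-contained.

```c
/* Hypothetical reconstruction of the testcase; NOT taken from the bug
   report. y() is a local counting stub standing in for what is
   presumably an external function in the original. */
static long calls;

void y(void)
{
    calls++;
}

/* Call y() n times (n assumed non-negative). The counter has to stay
   live across every call: in the -O2 listing it is reloaded from and
   stored back to 112(r1) each iteration, while the -Os listing keeps
   it in the callee-saved register r31. */
void x(long n)
{
    while (n--)
        y();
}
```

Compiling a file like this with -O2 and with -Os for a powerpc64 target and disassembling the object would be one way to compare against the listings, though the exact output depends on the compiler revision.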