[Bug middle-end/39284] Computed gotos combined too aggressively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284

Eric Gallager changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
                 CC|                            |egallager at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #16 from Eric Gallager ---
(In reply to Kai Tietz from comment #14)
> I think we can close that bug for now.
> Patch won't be backported.

OK, closed
[Bug middle-end/39284] Computed gotos combined too aggressively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284

--- Comment #15 from Richard Henderson rth at gcc dot gnu.org ---
Author: rth
Date: Mon Jun 30 20:14:42 2014
New Revision: 212172

URL: https://gcc.gnu.org/viewcvs?rev=212172&root=gcc&view=rev
Log:
PR rtl-opt/61608
PR target/39284
* bb-reorder.c (pass_duplicate_computed_gotos::execute): Cleanup the cfg
if there were any changes.
* passes.def: Revert move of peephole2 after reorder_blocks;
move duplicate_computed_gotos before peephole2.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/bb-reorder.c
    trunk/gcc/passes.def
[Bug middle-end/39284] Computed gotos combined too aggressively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284

--- Comment #13 from Kai Tietz ktietz at gcc dot gnu.org ---
Author: ktietz
Date: Mon Jun 23 21:52:31 2014
New Revision: 211919

URL: https://gcc.gnu.org/viewcvs?rev=211919&root=gcc&view=rev
Log:
PR target/39284
* passes.def (peephole2): Move peephole2 pass before sched2 pass.
* config/i386/i386.md (peephole2): Combine memories and indirect jumps.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md
    trunk/gcc/passes.def
[Bug middle-end/39284] Computed gotos combined too aggressively
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284

Kai Tietz ktietz at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |ktietz at gcc dot gnu.org

--- Comment #14 from Kai Tietz ktietz at gcc dot gnu.org ---
I think we can close that bug for now.
Patch won't be backported.
[Bug middle-end/39284] Computed gotos combined too aggressively
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284

Timo Kreuzer timo.kreuzer at reactos dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timo.kreuzer at reactos dot org

--- Comment #12 from Timo Kreuzer timo.kreuzer at reactos dot org ---
Any updates on this (after 5 years)? Or any workarounds?

I tried to optimize my x86 instruction parser code by using computed gotos. This post (https://blogs.oracle.com/nike/entry/fast_interpreter_using_gcc_s) suggests that it should give good speed improvements, and the compilation results shown there looked promising. But the results with newer GCC versions were so bad that I went back to a simple switch(). The 2 major speed improvements (fewer instructions / improved branch prediction) are completely killed by this bug.

Another thing that I find suspicious (maybe this is some crazy optimization I just don't understand) is that GCC no longer generates a single JMP with a memory operand. Instead it first loads the memory into a register and then does a register-based JMP, even when the load operation is exactly the same in all cases.

From the comments it looks like there is already a working patch. Why is it not committed? I'd really appreciate it if you could fix this asap.

PS: It would be great if you could make this work for switch statements in a loop as well! Normally people don't bother with computed gotos; they use a switch. If the switch is in a loop and the cases go back directly to the switch statement, the additional jump should be eliminated, possibly by duplicating n instructions from the top of the loop before the switch.
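For readers who have not used the technique under discussion, the following is a minimal sketch of the two dispatch styles compared in this comment. It is not taken from the reporter's parser or from ceval.c: the three-opcode bytecode and the names run_threaded and run_switch are hypothetical, made up purely for illustration. It relies on the GNU C "labels as values" extension (&&label and goto *), so it needs GCC or a compatible compiler.

```c
/* Hypothetical three-opcode bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Threaded dispatch: each handler ends in its own indirect jump.
   This is the shape the reporter wants to survive optimization.
   Assumes a well-formed program that ends with OP_HALT. */
static int run_threaded(const unsigned char *pc)
{
    /* GNU C extension: &&label yields the address of a label. */
    static void *const dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;

#define NEXT() goto *dispatch[*pc++]
    NEXT();

op_inc:
    acc++;
    NEXT();
op_dec:
    acc--;
    NEXT();
op_halt:
    return acc;
#undef NEXT
}

/* Switch-in-a-loop baseline: every case returns to one shared
   dispatch point at the top of the loop. */
static int run_switch(const unsigned char *pc)
{
    int acc = 0;
    for (;;) {
        switch (*pc++) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        default:      return acc; /* unknown opcode: stop */
        }
    }
}
```

Both functions compute the same result; the difference the thread cares about is only in the generated code. Whether the per-handler jumps in run_threaded are kept apart or merged into one is decided by the compiler, which is exactly what this bug is about.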
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #11 from hubicka at gcc dot gnu dot org 2009-06-09 15:18 ---
Hmm, it is not exactly a load. In the first case I get:

(code_label 12524 16482 12523 70 1249 [4 uses])
(note 12523 12524 149 70 [bb 70] NOTE_INSN_BASIC_BLOCK)
(insn:TI 149 12523 13690 70 ../src/Include/ceval-vm.i:47 (set (mem/c:SI (plus:SI (reg/f:SI 6 bp)
                (const_int -64 [0xffc0])) [72 %sfp+-40 S4 A32])
        (const_int 1 [0x1])) 47 {*movsi_1} (expr_list:REG_EQUAL (const_int 1 [0x1])
        (nil)))
(insn 13690 149 1351 70 (set (reg/v:SI 0 ax [orig:155 why ] [155])
        (const_int 1 [0x1])) 47 {*movsi_1} (nil))
(code_label 1351 13690 1352 71 382 [0 uses])
(note 1352 1351 1353 71 [bb 71] NOTE_INSN_BASIC_BLOCK)
(jump_insn:TI 1353 1352 1354 71 ../src/Python/ceval.c:1000 (set (pc)
        (mem/c:SI (plus:SI (reg/f:SI 6 bp)
                (const_int -60 [0xffc4])) [72 %sfp+-36 S4 A32])) 640 {*indirect_jump} (nil))
(barrier 1354 1353 1477)

So there are 4 edges reaching the set of WHY. In the second case it is the move setting WHY to 1:

(code_label 1363 1365 1349 150 384 [127 uses])
(note 1349 1363 1350 150 [bb 150] NOTE_INSN_BASIC_BLOCK)
(insn:TI 1350 1349 19980 150 ../src/Python/ceval.c:1000 (set (reg/v:SI 0 ax [orig:155 why ] [155])
        (const_int 1 [0x1])) 47 {*movsi_1} (nil))
(note 19980 1350 19979 151 [bb 151] NOTE_INSN_BASIC_BLOCK)
(jump_insn 19979 19980 19982 151 ../src/Python/ceval.c:1000 (set (pc)
        (mem/c:SI (plus:SI (reg/f:SI 6 bp)
                (const_int -60 [0xffc4])) [72 %sfp+-36 S4 A32])) 640 {*indirect_jump} (nil))
(barrier 19982 19979 1373)

that prevents duplicating. Probably ordinary bb-reorder could be convinced to handle this well? This doesn't seem to happen in 64-bit compilation.

I also posted the patch fixing the optimize_for_size check.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #8 from hubicka at gcc dot gnu dot org 2009-06-08 20:55 ---
Hmm, the conditional is bogus: there should not be a "!". But even after patching this we still don't duplicate. The reason is that BB 71 (containing conditional jump) is reached via 2 BBs containing a memory load. I guess it is a result of crossjumping.

I will send a patch fixing the conditional, but it makes little difference.

Honza

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #9 from steven at gcc dot gnu dot org 2009-06-08 22:43 ---
There shouldn't be a "!". Just use optimize_function_for_speed_p(cfun).

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #10 from steven at gcc dot gnu dot org 2009-06-08 22:45 --- Honza, what do the basic blocks 2 and 71 look like for you, exactly? I see no memory load. But I have local crossjumping patches -- as you know ;-) -- so I am probably not looking at the same dumps as you are. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #6 from rguenth at gcc dot gnu dot org 2009-02-24 10:17 ---
rguent...@murzim:/tmp gcc-4.4 -m32 -fno-strict-aliasing -fwrapv -O3 --param max-goto-duplication-insns=10 -S ceval.i
rguent...@murzim:/tmp egrep -c 'jmp[[:space:]]*\*' ceval.s
4
rguent...@murzim:/tmp gcc-4.3 -m32 -fno-strict-aliasing -fwrapv -O3 --param max-goto-duplication-insns=10 -S ceval.i
rguent...@murzim:/tmp egrep -c 'jmp[[:space:]]*\*' ceval.s
5
rguent...@murzim:/tmp gcc-4.2 -m32 -fno-strict-aliasing -fwrapv -O3 --param max-goto-duplication-insns=10 -S ceval.i
rguent...@murzim:/tmp egrep -c 'jmp[[:space:]]*\*' ceval.s
5
rguent...@murzim:/tmp gcc-4.1 -m32 -fno-strict-aliasing -fwrapv -O3 --param max-goto-duplication-insns=10 -S ceval.i
rguent...@murzim:/tmp egrep -c 'jmp[[:space:]]*\*' ceval.s
19

How many gotos do you expect?

-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu dot org
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #7 from jyasskin at gmail dot com 2009-02-24 15:26 --- I'd like to get gcc not to combine any of them, which I believe would produce 130, as many as the asm volatiles that survived optimization. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #2 from pinskia at gcc dot gnu dot org 2009-02-23 22:46 ---
This is by design; GCSE is the one which pulls back the computed gotos IIRC.

-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |middle-end

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #3 from pinskia at gcc dot gnu dot org 2009-02-23 22:48 --- It says may but not will get better performance. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #4 from jyasskin at gmail dot com 2009-02-23 22:58 ---
Taking out -fno-gcse doesn't change the result.

$ gcc-4.4 -m32 -pthread -fno-strict-aliasing -g -fwrapv -O3 --param max-goto-duplication-insns=10 -S -dA ceval.i -o ceval.s
$ egrep -c 'jmp[[:space:]]*\*' ceval.s
4

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
[Bug middle-end/39284] Computed gotos combined too aggressively
--- Comment #5 from steven at gcc dot gnu dot org 2009-02-23 23:19 ---
Should unfactor. We have pass_duplicate_computed_gotos for this. We should look into this and see why it doesn't work.

Someone added an optimize_for_size_p() check in duplicate_computed_gotos(). That is just *stupid*. If there should be a check like that, it should be per function, not per basic block. I bet that's the cause.

-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2009-02-23 23:19:12
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39284
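As a rough source-level picture of what "unfactor" refers to here, the sketch below writes out the factored shape by hand: every handler jumps to one shared label, and that label holds the only indirect jump. This is an illustration under assumptions, not GCC's internal representation and not code from ceval.c; the opcode names and run_factored are hypothetical. Duplicating the computed goto would mean copying the jump at the dispatch label back to the end of each handler.

```c
/* Hypothetical three-opcode bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Factored dispatch: one shared indirect jump for all handlers.
   Uses the GNU C "labels as values" extension. Assumes a
   well-formed program that ends with OP_HALT. */
static int run_factored(const unsigned char *pc)
{
    static void *const table[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;

dispatch:
    /* The only indirect jump in the function. Unfactoring would
       replace each "goto dispatch" below with a copy of this line. */
    goto *table[*pc++];

op_inc:
    acc++;
    goto dispatch;
op_dec:
    acc--;
    goto dispatch;
op_halt:
    return acc;
}
```

With a single shared jump, a count such as the egrep over 'jmp[[:space:]]*\*' used elsewhere in this thread would be expected to stay small, while the fully duplicated form would give roughly one indirect jump per handler.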