[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Jakub Jelinek ---
Fixed.
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

--- Comment #10 from Jakub Jelinek ---
Author: jakub
Date: Sat Feb 27 06:43:20 2016
New Revision: 233777

URL: https://gcc.gnu.org/viewcvs?rev=233777&root=gcc&view=rev
Log:
        PR rtl-optimization/69896
        * tree-vect-generic.c (get_compute_type): Avoid single element
        vector types.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-vect-generic.c
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

--- Comment #9 from Jakub Jelinek ---
Created attachment 37810
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37810&action=edit
gcc6-pr69896.patch

And here is the unfinished BIT_FIELD_REF folding patch. This fixes one issue,
where we are asking for a 1x vector out of, say, a 2x vector, and we
incorrectly get the element instead of { element }. But that just seems to be
the tip of the iceberg: we can then end up with, say, a VECTOR_CST with a
single element inside of a CONSTRUCTOR and might be assuming that we get the
element type instead of the 1x vector, etc.
I'd say we really should avoid the 1x vectors at the tree/gimple levels as
much as possible.
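
To illustrate the distinction between the element and { element } at the source level, here is a small sketch using the GCC vector extensions. It is not taken from the patch or from the testcase; the typedef names v2di and v1di and the helper names low_element and low_subvector are hypothetical, chosen only for this illustration. The two helpers return the same bits, but one yields the scalar element type and the other a one-element vector, which the compiler treats as a different type.

```c
#include <string.h>

/* Hypothetical typedefs for illustration (GCC/Clang vector extension). */
typedef unsigned long long v2di __attribute__((vector_size(16))); /* 2x vector */
typedef unsigned long long v1di __attribute__((vector_size(8)));  /* 1x vector */

/* Lane 0 as the element type: "element". */
static unsigned long long low_element(v2di v)
{
    return v[0];
}

/* The low half as a one-element vector: "{ element }".
   Lane 0 sits at the lowest address, so copying the first
   sizeof(v1di) bytes picks exactly that lane. */
static v1di low_subvector(v2di v)
{
    v1di r;
    memcpy(&r, &v, sizeof r);
    return r;
}
```

Both results carry the same value, yet __builtin_types_compatible_p (v1di, unsigned long long) evaluates to 0, so code that expects one form and receives the other is working with a mismatched type.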
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

--- Comment #8 from Jakub Jelinek ---
Created attachment 37809
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37809&action=edit
gcc6-pr69896.patch

Ah, managed to reproduce. I have one patch and one partial incomplete one.
It doesn't make any sense to me to use TYPE_VECTOR_SUBPARTS () == 1 vectors
for the computation if it is the widest supported vector type, we should just
use the element type instead. So that is what this patch does.
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

--- Comment #7 from Jakub Jelinek ---
Can't reproduce such ICE (with a cross). That said, this was an RTL bug, so if
it ICEs during gimple verification, IMHO it should be tracked in a different
PR.
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Segher Boessenkool changed:

           What    |Removed                     |Added
             Status|RESOLVED                    |REOPENED
                 CC|                            |segher at gcc dot gnu.org
         Resolution|FIXED                       |---

--- Comment #6 from Segher Boessenkool ---
This new testcase fails on powerpc64le-linux:

69896.c:16:1: internal compiler error: verify_gimple failed
0x1089ffb3 verify_gimple_in_cfg(function*, bool)
        /home/segher/src/gcc/gcc/tree-cfg.c:5082
0x107412c3 execute_function_todo
        /home/segher/src/gcc/gcc/passes.c:1958
0x10741eb3 do_per_function
        /home/segher/src/gcc/gcc/passes.c:1645
0x107421df execute_todo
        /home/segher/src/gcc/gcc/passes.c:2010
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Jakub Jelinek ---
Fixed.
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

--- Comment #4 from Jakub Jelinek ---
Author: jakub
Date: Thu Feb 25 08:09:02 2016
New Revision: 233692

URL: https://gcc.gnu.org/viewcvs?rev=233692&root=gcc&view=rev
Log:
        PR rtl-optimization/69896
        * regcprop.c: Include cfgrtl.h.
        (copyprop_hardreg_forward_1): If noop_p insn uses narrower
        than remembered mode, either delete it (if noop_move_p), or
        treat like copy_p but not noop_p instruction.

        * gcc.dg/pr69896.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr69896.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/regcprop.c
    trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Created attachment 37757
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37757&action=edit
gcc6-pr69896.patch

Untested fix. regcprop (which is invoked on the first bb by
prepare_shrink_wrap) ignores noop_p moves, which is fine if they are done in
the mode we remember for the register, but if the move is narrower, ignoring
it is wrong. The patch removes such moves if DCE would remove them, and keeps
them around otherwise (but in that case makes sure we actually update the
mode etc.).
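
The decision described in this comment can be sketched as a toy model in C. This is not the code from regcprop.c: the enum, the struct, and the function name classify_noop_move are hypothetical, the removable flag merely stands in for the noop_move_p check, and mode widths are modelled as plain bit counts. The sketch also assumes that only moves narrower than the remembered mode need special handling, which is all the comment states.

```c
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not GCC's. */
enum noop_action {
    IGNORE_NOOP,    /* same-mode no-op move: safe to ignore, as before */
    DELETE_INSN,    /* narrower no-op move that DCE would remove anyway */
    TREAT_AS_COPY   /* narrower no-op move that stays: update the mode */
};

struct noop_move {
    unsigned mode_bits;  /* width of the move, e.g. 32 for SImode */
    bool removable;      /* stands in for noop_move_p () */
};

/* remembered_bits is the width of the mode recorded for the register,
   e.g. 128 for TImode or 64 for DImode. */
static enum noop_action
classify_noop_move(unsigned remembered_bits, struct noop_move m)
{
    /* Assumption: a move that is not narrower than the remembered
       mode keeps the old behaviour and is ignored. */
    if (m.mode_bits >= remembered_bits)
        return IGNORE_NOOP;

    /* Narrower than remembered: ignoring it would leave a stale,
       too-wide mode recorded for the register. */
    return m.removable ? DELETE_INSN : TREAT_AS_COPY;
}
```

In this model, an SImode no-op move on a register remembered in DImode is never silently skipped: it is either deleted or handled like a copy so that the recorded mode is narrowed.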
[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x64_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Jakub Jelinek changed:

           What    |Removed                     |Added
                 CC|                            |bernds at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
Yeah, in *.postreload we had:
(insn 250 28 30 2 (set (reg:TI 3 bx [235])
        (mem/j/c:TI (plus:DI (reg/f:DI 7 sp)
                (const_int 704 [0x2c0])) [1 v32u128_1+0 S16 A256])) pr69896.c:13 84 {*movti_internal}
     (nil))
(insn 30 250 273 2 (set (reg:TI 40 r11 [orig:98 _15 ] [98])
        (reg:TI 3 bx [235])) pr69896.c:13 84 {*movti_internal}
     (nil))
(note 273 30 31 2 NOTE_INSN_DELETED)
(insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
        (reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
     (nil))
...
(insn 271 37 278 2 (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
                (const_int 32 [0x20])) [6 %sfp+-352 S16 A128])
        (reg:TI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 84 {*movti_internal}
     (nil))
(uses of reg:SI 3 bx [orig:99 _16 ] [99]) were removed during *.postreload).

Then in *.split2 we get:
(insn 279 28 280 2 (set (reg:DI 3 bx [235])
        (mem/j/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 704 [0x2c0])) [1 v32u128_1+0 S8 A256])) pr69896.c:13 85 {*movdi_internal}
     (nil))
(insn 280 279 281 2 (set (reg:DI 4 si [+8 ])
        (mem/j/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 712 [0x2c8])) [1 v32u128_1+8 S8 A64])) pr69896.c:13 85 {*movdi_internal}
     (nil))
(insn 281 280 282 2 (set (reg:DI 40 r11 [orig:98 _15 ] [98])
        (reg:DI 3 bx [235])) pr69896.c:13 85 {*movdi_internal}
     (nil))
(insn 282 281 273 2 (set (reg:DI 41 r12 [ _15+8 ])
        (reg:DI 4 si [+8 ])) pr69896.c:13 85 {*movdi_internal}
     (nil))
(note 273 282 31 2 NOTE_INSN_DELETED)
(insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
        (reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
     (nil))
...
(insn 283 37 284 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 32 [0x20])) [6 %sfp+-352 S8 A128])
        (reg:DI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
     (nil))
(insn 284 283 278 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 40 [0x28])) [6 %sfp+-344 S8 A64])
        (reg:DI 41 r12 [ _15+8 ])) pr69896.c:13 85 {*movdi_internal}
     (nil))

I believe it is the *.pro_and_epilogue pass that makes the invalid
transformation:
 (insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
-        (reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
+        (reg:SI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
      (nil))
is fine, but
 (insn 283 37 284 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                 (const_int 32 [0x20])) [6 %sfp+-352 S8 A128])
-        (reg:DI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
+        (reg:DI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
      (nil))
is wrong, unless insn 31 is removed or turned into DImode assignment instead of
SImode.
 (insn 284 283 278 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                 (const_int 40 [0x28])) [6 %sfp+-344 S8 A64])
-        (reg:DI 41 r12 [ _15+8 ])) pr69896.c:13 85 {*movdi_internal}
+        (reg:DI 4 si [orig:41 _15+8 ] [41])) pr69896.c:13 85 {*movdi_internal}
      (nil))
 (note 278 284 251 2 NOTE_INSN_DELETED)
 (insn 251 278 38 2 (set (reg:SI 4 si [236])
-        (reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
+        (reg:SI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
      (nil))
 (insn 38 251 39 2 (set (mem/c:SI (plus:DI (reg/f:DI 7 sp)
                 (const_int 88 [0x58])) [3 S4 A64])
-        (reg:SI 4 si [236])) pr69896.c:13 86 {*movsi_internal}
+        (reg:SI 3 bx [236])) pr69896.c:13 86 {*movsi_internal}
      (nil))
are all fine. -fno-shrink-wrapping indeed fixes this, as the invalid
transformation is not performed.