[Bug rtl-optimization/69896] [6 Regression] wrong code with -frename-registers @ x86_64

2016-02-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69896

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Jakub Jelinek  ---
Fixed.

2016-02-26 Thread jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek  ---
Author: jakub
Date: Sat Feb 27 06:43:20 2016
New Revision: 233777

URL: https://gcc.gnu.org/viewcvs?rev=233777&root=gcc&view=rev
Log:
PR rtl-optimization/69896
* tree-vect-generic.c (get_compute_type): Avoid single element
vector types.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-generic.c

2016-02-26 Thread jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
Created attachment 37810
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37810&action=edit
gcc6-pr69896.patch

And here is the unfinished BIT_FIELD_REF folding patch.  This fixes one issue,
where we ask for a 1x vector out of, say, a 2x vector and incorrectly get the
element instead of { element }.  But that seems to be just the tip of the
iceberg: we can then end up with, say, a VECTOR_CST with a single element
inside of a CONSTRUCTOR, and code might assume that we get the element type
instead of the 1x vector, etc.  I'd say we really should avoid 1x vectors at
the tree/gimple levels as much as possible.
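
To make the element vs. { element } distinction concrete, here is a toy model
in plain C.  It is not GCC code: struct value, extract_element and
extract_subvector are names made up for this sketch, and the 4-element limit
is an arbitrary assumption.  The point is only that asking for a 1-element
subvector should still yield something of vector kind.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a typed value; nothing here mirrors GCC's tree API.  */
enum value_kind { KIND_SCALAR, KIND_VECTOR };

struct value {
  enum value_kind kind;
  size_t nunits;   /* number of elements; 1 for scalars */
  int elts[4];     /* arbitrary small capacity for the sketch */
};

/* Element extraction: the result is a scalar.  */
struct value extract_element(const struct value *vec, size_t index)
{
  struct value r = { KIND_SCALAR, 1, { 0 } };
  assert(vec->kind == KIND_VECTOR && index < vec->nunits);
  r.elts[0] = vec->elts[index];
  return r;
}

/* Subvector extraction: the result stays a vector, even when count == 1.
   Returning a scalar here is the analogue of getting `element` where
   `{ element }` was requested.  */
struct value extract_subvector(const struct value *vec, size_t start,
                               size_t count)
{
  struct value r = { KIND_VECTOR, count, { 0 } };
  assert(vec->kind == KIND_VECTOR && count >= 1
         && start + count <= vec->nunits);
  for (size_t i = 0; i < count; i++)
    r.elts[i] = vec->elts[start + i];
  return r;
}
```

Both helpers hand back the same payload for a 2x input; only the kind of the
result differs, and that is exactly what later consumers may get wrong.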

2016-02-26 Thread jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
Created attachment 37809
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37809&action=edit
gcc6-pr69896.patch

Ah, managed to reproduce.  I have one patch and one partial, incomplete one.
It doesn't make any sense to me to use TYPE_VECTOR_SUBPARTS () == 1 vectors
for the computation when that is the widest supported vector type; we should
just use the element type instead.  That is what this patch does.
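
The rule can be sketched outside of GCC roughly as follows.  This is a toy
model, not the code in get_compute_type: struct type_desc, pick_compute_type
and the bit-width parameters are all hypothetical, and elt_bits is assumed to
be nonzero.

```c
#include <assert.h>

/* Hypothetical description of the type chosen for the computation.  */
struct type_desc {
  unsigned elt_bits;  /* width of one element */
  unsigned nunits;    /* number of elements; 1 for a scalar */
  int is_vector;      /* 0 means "use the element type itself" */
};

/* Pick a compute type given the element width and the widest vector width
   the target supports.  If that widest vector would hold at most one
   element, fall back to the plain element type instead of a 1x vector.  */
struct type_desc pick_compute_type(unsigned elt_bits,
                                   unsigned widest_vector_bits)
{
  struct type_desc t;
  unsigned nunits = widest_vector_bits / elt_bits;

  t.elt_bits = elt_bits;
  if (nunits <= 1)
    {
      t.nunits = 1;
      t.is_vector = 0;
    }
  else
    {
      t.nunits = nunits;
      t.is_vector = 1;
    }
  return t;
}
```

For example, with 128-bit elements and a 128-bit widest vector the sketch
returns the scalar element type, while 32-bit elements in the same vector
width give a 4x vector.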

2016-02-26 Thread jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
Can't reproduce this ICE (with a cross compiler).
That said, this was an RTL bug, so if it ICEs during gimple verification, IMHO
it should be tracked in a different PR.

2016-02-25 Thread segher at gcc dot gnu.org

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |segher at gcc dot gnu.org
         Resolution|FIXED                       |---

--- Comment #6 from Segher Boessenkool  ---
This new testcase fails on powerpc64le-linux:

69896.c:16:1: internal compiler error: verify_gimple failed
0x1089ffb3 verify_gimple_in_cfg(function*, bool)
/home/segher/src/gcc/gcc/tree-cfg.c:5082
0x107412c3 execute_function_todo
/home/segher/src/gcc/gcc/passes.c:1958
0x10741eb3 do_per_function
/home/segher/src/gcc/gcc/passes.c:1645
0x107421df execute_todo
/home/segher/src/gcc/gcc/passes.c:2010

2016-02-25 Thread jakub at gcc dot gnu.org

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Jakub Jelinek  ---
Fixed.

2016-02-25 Thread jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Author: jakub
Date: Thu Feb 25 08:09:02 2016
New Revision: 233692

URL: https://gcc.gnu.org/viewcvs?rev=233692&root=gcc&view=rev
Log:
PR rtl-optimization/69896
* regcprop.c: Include cfgrtl.h.
(copyprop_hardreg_forward_1): If noop_p insn uses narrower
than remembered mode, either delete it (if noop_move_p), or
treat like copy_p but not noop_p instruction.

* gcc.dg/pr69896.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr69896.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/regcprop.c
trunk/gcc/testsuite/ChangeLog

2016-02-22 Thread jakub at gcc dot gnu.org

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 37757
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37757&action=edit
gcc6-pr69896.patch

Untested fix.  regcprop (which is invoked on the first bb by
prepare_shrink_wrap) ignores noop_p moves, which is fine if they are done in
the mode we remember for the register, but wrong if the mode is narrower.
The patch removes such moves if DCE would remove them, and otherwise keeps them
around (but in that case makes sure we actually update the mode etc.).
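
The decision described above can be modelled in a few lines of standalone C.
This is only a sketch of the logic, not the regcprop.c change itself: the
struct, the enum and classify_noop_move are invented for illustration, modes
are reduced to bit widths, and dce_would_delete stands in for the check on
whether DCE would remove the insn.

```c
#include <assert.h>
#include <stdbool.h>

/* What the pass does with a no-op move in this toy model.  */
enum noop_action {
  IGNORE_NOOP,     /* old behaviour: skip the insn, tracked value unchanged */
  DELETE_INSN,     /* narrower no-op move that DCE would remove anyway */
  TREAT_AS_COPY    /* keep the insn, but record the narrower mode */
};

/* Hypothetical summary of one no-op move.  */
struct noop_move {
  unsigned move_bits;        /* width of the mode the move is done in */
  unsigned remembered_bits;  /* width of the mode recorded for the register */
  bool dce_would_delete;     /* stand-in for "DCE would remove this insn" */
};

enum noop_action classify_noop_move(struct noop_move m)
{
  /* Not narrower than the remembered mode: ignoring the move is safe.  */
  if (m.move_bits >= m.remembered_bits)
    return IGNORE_NOOP;

  /* Narrower than remembered: either drop it, or process it like an
     ordinary copy so the recorded mode gets updated.  */
  return m.dce_would_delete ? DELETE_INSN : TREAT_AS_COPY;
}
```

In the dump from comment #2 the interesting case is an SImode move on a
register remembered in DImode, i.e. move_bits == 32 and remembered_bits == 64.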

2016-02-22 Thread jakub at gcc dot gnu.org

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernds at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Yeah, in *.postreload we had:
(insn 250 28 30 2 (set (reg:TI 3 bx [235])
(mem/j/c:TI (plus:DI (reg/f:DI 7 sp)
(const_int 704 [0x2c0])) [1 v32u128_1+0 S16 A256]))
pr69896.c:13 84 {*movti_internal}
 (nil))
(insn 30 250 273 2 (set (reg:TI 40 r11 [orig:98 _15 ] [98])
(reg:TI 3 bx [235])) pr69896.c:13 84 {*movti_internal}
 (nil))
(note 273 30 31 2 NOTE_INSN_DELETED)
(insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
(reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
 (nil))
...
(insn 271 37 278 2 (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
(const_int 32 [0x20])) [6 %sfp+-352 S16 A128])
(reg:TI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 84 {*movti_internal}
 (nil))
(uses of reg:SI 3 bx [orig:99 _16 ] [99] were removed during *.postreload).
Then in *.split2 we get:
(insn 279 28 280 2 (set (reg:DI 3 bx [235])
(mem/j/c:DI (plus:DI (reg/f:DI 7 sp)
(const_int 704 [0x2c0])) [1 v32u128_1+0 S8 A256])) pr69896.c:13 85 {*movdi_internal}
 (nil))
(insn 280 279 281 2 (set (reg:DI 4 si [+8 ])
(mem/j/c:DI (plus:DI (reg/f:DI 7 sp)
(const_int 712 [0x2c8])) [1 v32u128_1+8 S8 A64])) pr69896.c:13 85 {*movdi_internal}
 (nil))
(insn 281 280 282 2 (set (reg:DI 40 r11 [orig:98 _15 ] [98])
(reg:DI 3 bx [235])) pr69896.c:13 85 {*movdi_internal}
 (nil))
(insn 282 281 273 2 (set (reg:DI 41 r12 [ _15+8 ])
(reg:DI 4 si [+8 ])) pr69896.c:13 85 {*movdi_internal}
 (nil))
(note 273 282 31 2 NOTE_INSN_DELETED)
(insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
(reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
 (nil)) 
...
(insn 283 37 284 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
(const_int 32 [0x20])) [6 %sfp+-352 S8 A128])
(reg:DI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
 (nil))
(insn 284 283 278 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
(const_int 40 [0x28])) [6 %sfp+-344 S8 A64])
(reg:DI 41 r12 [ _15+8 ])) pr69896.c:13 85 {*movdi_internal}
 (nil))

I believe it is the *.pro_and_epilogue pass that makes the invalid
transformation:
 (insn 31 273 32 2 (set (reg:SI 3 bx [orig:99 _16 ] [99])
-(reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
+(reg:SI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
  (nil))
is fine, but
 (insn 283 37 284 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
 (const_int 32 [0x20])) [6 %sfp+-352 S8 A128])
-(reg:DI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
+(reg:DI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 85 {*movdi_internal}
  (nil))
is wrong, unless insn 31 is removed or turned into a DImode assignment instead of
an SImode one.
 (insn 284 283 278 2 (set (mem/c:DI (plus:DI (reg/f:DI 7 sp)
 (const_int 40 [0x28])) [6 %sfp+-344 S8 A64])
-(reg:DI 41 r12 [ _15+8 ])) pr69896.c:13 85 {*movdi_internal}
+(reg:DI 4 si [orig:41 _15+8 ] [41])) pr69896.c:13 85 {*movdi_internal}
  (nil))
 (note 278 284 251 2 NOTE_INSN_DELETED)
 (insn 251 278 38 2 (set (reg:SI 4 si [236])
-(reg:SI 40 r11 [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
+(reg:SI 3 bx [orig:98 _15 ] [98])) pr69896.c:13 86 {*movsi_internal}
  (nil))
 (insn 38 251 39 2 (set (mem/c:SI (plus:DI (reg/f:DI 7 sp)
 (const_int 88 [0x58])) [3  S4 A64])
-(reg:SI 4 si [236])) pr69896.c:13 86 {*movsi_internal}
+(reg:SI 3 bx [236])) pr69896.c:13 86 {*movsi_internal}
  (nil))
are all fine.  -fno-shrink-wrapping indeed fixes this, as the invalid
transformation is not performed.
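
For readers less used to RTL, here is a small C model of why the insn 283
replacement is wrong while the SImode ones are fine.  It assumes the SImode
move on bx ends up as a 32-bit register write, which on x86-64 clears the
upper 32 bits of the destination.  hard_reg, move_di, move_si and replay_store
are names made up for this sketch; only the insn numbers come from the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one 64-bit hard register.  */
typedef struct { uint64_t bits; } hard_reg;

/* DImode move: copies all 64 bits.  */
void move_di(hard_reg *dst, const hard_reg *src)
{
  dst->bits = src->bits;
}

/* SImode move as modelled here: writes the low 32 bits and zeroes the
   upper 32 bits of the destination.  */
void move_si(hard_reg *dst, const hard_reg *src)
{
  dst->bits = (uint32_t) src->bits;
}

/* Replay insns 279, 281, 31 and 283 on the low half of the TImode value.
   If use_bx_in_283 is nonzero, insn 283 stores bx (the transformed form),
   otherwise it stores r11 (the original form).  Returns the stored value.  */
uint64_t replay_store(uint64_t loaded, int use_bx_in_283)
{
  hard_reg bx = { loaded };  /* insn 279: bx = low 64 bits from memory */
  hard_reg r11 = { 0 };

  move_di(&r11, &bx);        /* insn 281: r11 = bx in DImode */
  move_si(&bx, &r11);        /* insn 31: bx = r11 in SImode */

  return use_bx_in_283 ? bx.bits : r11.bits;  /* insn 283 source */
}
```

With a value that has bits set above bit 31, the r11 variant stores the full
64 bits and the bx variant stores only the low 32; for values that fit in 32
bits the two agree, which is why the SImode replacements are harmless.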