[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Jakub Jelinek changed: What|Removed |Added CC||sukhovvl at gmail dot com --- Comment #20 from Jakub Jelinek --- *** Bug 87646 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Jakub Jelinek changed: What|Removed |Added Target Milestone|--- |4.9.4
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #19 from Zdenek Sojka --- Target milestone 4.9.4 is not set.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #18 from Jakub Jelinek --- Fixed.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #17 from Jakub Jelinek --- Author: jakub Date: Thu Jul 7 21:47:40 2016 New Revision: 238137 URL: https://gcc.gnu.org/viewcvs?rev=238137=gcc=rev Log: Backported from mainline 2016-02-26 Jakub JelinekEric Botcazou PR rtl-optimization/69891 * dse.c (scan_insn): If we can't figure out memset arguments or they are non-constant, call clear_rhs_from_active_local_stores. * gcc.target/i386/pr69891.c: New test. Added: branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr69891.c Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/dse.c branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #16 from Jakub Jelinek --- Fixed for 5.4+ too.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #15 from Jakub Jelinek --- Author: jakub Date: Wed Mar 30 12:34:48 2016 New Revision: 234555 URL: https://gcc.gnu.org/viewcvs?rev=234555=gcc=rev Log: Backported from mainline 2016-02-26 Jakub JelinekEric Botcazou PR rtl-optimization/69891 * dse.c (scan_insn): If we can't figure out memset arguments or they are non-constant, call clear_rhs_from_active_local_stores. * gcc.target/i386/pr69891.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr69891.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/dse.c branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #14 from Jakub Jelinek --- Fixed on the trunk so far.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #13 from Jakub Jelinek --- Author: jakub Date: Fri Feb 26 15:53:43 2016 New Revision: 233743 URL: https://gcc.gnu.org/viewcvs?rev=233743=gcc=rev Log: PR rtl-optimization/69891 * dse.c (scan_insn): If we can't figure out memset arguments or they are non-constant, call clear_rhs_from_active_local_stores. * gcc.target/i386/pr69891.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr69891.c Modified: trunk/gcc/ChangeLog trunk/gcc/dse.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #12 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #11) > (In reply to Jakub Jelinek from comment #9) > > Created attachment 37801 [details] > > gcc6-pr69891.patch > > > > Full patch I'm going to bootstrap/regtest. > > +/* PR rtl-optimization/69891 */ > +/* { dg-do run } */ > +/* { dg-options "-O -fno-tree-fre -mstringop-strategy=libcall -Wno-psabi" } > */ > +/* { dg-additional-options "-mno-sse" { target ia32 } } */ > > -mstringop-strategy is x86 specific option. Oh ... the testcase *is* in gcc.target/i386/... directory
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #11 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #9) > Created attachment 37801 [details] > gcc6-pr69891.patch > > Full patch I'm going to bootstrap/regtest. +/* PR rtl-optimization/69891 */ +/* { dg-do run } */ +/* { dg-options "-O -fno-tree-fre -mstringop-strategy=libcall -Wno-psabi" } */ +/* { dg-additional-options "-mno-sse" { target ia32 } } */ -mstringop-strategy is x86 specific option.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Eric Botcazou changed: What|Removed |Added Assignee|ebotcazou at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #10 from Eric Botcazou --- > I think the right fix is just: > --- gcc/dse.c.jj 2016-01-19 13:32:12.0 +0100 > +++ gcc/dse.c 2016-02-26 11:03:36.694206088 +0100 > @@ -2556,6 +2556,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * > active_local_stores = insn_info; > } > } > + else > + clear_rhs_from_active_local_stores (); > } > } >else if (SIBLING_CALL_P (insn) && reload_completed) > > memset, even if we can't figure out its arguments, is never a read of > anything (except for the arguments, but those are already handled similarly > to const calls). clear_rhs_from_active_local_stores () is the punt action > in record_store, we store something, but we don't know where exactly. Yes, this looks more precise than reset_active_stores indeed. Assign back if you want me to do the testing.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #9 from Jakub Jelinek --- Created attachment 37801 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37801=edit gcc6-pr69891.patch Full patch I'm going to bootstrap/regtest.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #8 from Jakub Jelinek --- I think the right fix is just: --- gcc/dse.c.jj2016-01-19 13:32:12.0 +0100 +++ gcc/dse.c 2016-02-26 11:03:36.694206088 +0100 @@ -2556,6 +2556,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * active_local_stores = insn_info; } } + else + clear_rhs_from_active_local_stores (); } } else if (SIBLING_CALL_P (insn) && reload_completed) memset, even if we can't figure out its arguments, is never a read of anything (except for the arguments, but those are already handled similarly to const calls). clear_rhs_from_active_local_stores () is the punt action in record_store, we store something, but we don't know where exactly.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Eric Botcazou changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ebotcazou at gcc dot gnu.org --- Comment #7 from Eric Botcazou --- Testing a fix.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #6 from Eric Botcazou --- > But the memset could be also SIBLING_CALL_P. > Wouldn't it be better to change the else if to if, and add > if (const_call) return; > plus return to the end of mems_found == 1 then block? Then it would fall > through to arbitrary other call handling. That would unnecessarily add a wild read. So what about: Index: dse.c === --- dse.c (revision 233545) +++ dse.c (working copy) @@ -2528,6 +2528,10 @@ scan_insn (bb_info_t bb_info, rtx_insn * i_ptr = i_ptr->next_local_store; } + /* But a call to memset clobbers memory so invalidates stores. It's +not only an optimization issue (the previous stores may be dead) +but also a correctness issue since the previous stores cannot be +seen as the source of the current value of the locations. */ if (memset_call) { rtx args[3]; @@ -2556,6 +2560,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * active_local_stores = insn_info; } } + else if (!SIBLING_CALL_P (insn)) + reset_active_stores (); } } else if (SIBLING_CALL_P (insn) && reload_completed)
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #5 from Jakub Jelinek --- (In reply to Eric Botcazou from comment #3) > The problematic store is based on argp so it isn't killed by the memset, > since only those based on sp or fp are: > > **scanning insn=20 > mem: (symbol_ref:SI ("memset") [flags 0x41] memset>) > >after canon_rtx address: (symbol_ref:SI ("memset") [flags 0x41] > ) > gid=1 offset=0 > processing const load gid=1[0..1) > removing from active insn=19 has store > removing from active insn=18 has store > removing from active insn=17 has store > memset call 20 > > It's apparently a small loophole in the PR middle-end/31150 enhancement. > > Index: dse.c > === > --- dse.c (revision 233545) > +++ dse.c (working copy) > @@ -2528,6 +2528,10 @@ scan_insn (bb_info_t bb_info, rtx_insn * > i_ptr = i_ptr->next_local_store; > } > > + /* But a call to memset clobbers memory so invalidates stores. > It's > +not only an optimization issue (the previous stores may be dead) > +but also a correctness issue since the previous stores cannot be > +seen as the source of the current value of the locations. */ > if (memset_call) > { > rtx args[3]; > @@ -2556,6 +2560,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * > active_local_stores = insn_info; > } > } > + else > + add_non_frame_wild_read (bb_info); > } > } >else if (SIBLING_CALL_P (insn) && reload_completed) > > > Jakub, does this look good to you? But the memset could be also SIBLING_CALL_P. Wouldn't it be better to change the else if to if, and add if (const_call) return; plus return to the end of mems_found == 1 then block? Then it would fall through to arbitrary other call handling.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 --- Comment #4 from Eric Botcazou --- > It's apparently a small loophole in the PR middle-end/31150 enhancement. In fact this is sufficient: Index: dse.c === --- dse.c (revision 233545) +++ dse.c (working copy) @@ -2528,6 +2528,10 @@ scan_insn (bb_info_t bb_info, rtx_insn * i_ptr = i_ptr->next_local_store; } + /* But a call to memset clobbers memory so invalidates stores. It's +not only an optimization issue (the previous stores may be dead) +but also a correctness issue since the previous stores cannot be +seen as the source of the current value of the locations. */ if (memset_call) { rtx args[3]; @@ -2556,6 +2560,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * active_local_stores = insn_info; } } + else + reset_active_stores (); } } else if (SIBLING_CALL_P (insn) && reload_completed)
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Eric Botcazou changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Eric Botcazou --- The problematic store is based on argp so it isn't killed by the memset, since only those based on sp or fp are: **scanning insn=20 mem: (symbol_ref:SI ("memset") [flags 0x41] ) after canon_rtx address: (symbol_ref:SI ("memset") [flags 0x41] ) gid=1 offset=0 processing const load gid=1[0..1) removing from active insn=19 has store removing from active insn=18 has store removing from active insn=17 has store memset call 20 It's apparently a small loophole in the PR middle-end/31150 enhancement. Index: dse.c === --- dse.c (revision 233545) +++ dse.c (working copy) @@ -2528,6 +2528,10 @@ scan_insn (bb_info_t bb_info, rtx_insn * i_ptr = i_ptr->next_local_store; } + /* But a call to memset clobbers memory so invalidates stores. It's +not only an optimization issue (the previous stores may be dead) +but also a correctness issue since the previous stores cannot be +seen as the source of the current value of the locations. */ if (memset_call) { rtx args[3]; @@ -2556,6 +2560,8 @@ scan_insn (bb_info_t bb_info, rtx_insn * active_local_stores = insn_info; } } + else + add_non_frame_wild_read (bb_info); } } else if (SIBLING_CALL_P (insn) && reload_completed) Jakub, does this look good to you?
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Uroš Bizjak changed: What|Removed |Added CC||ebotcazou at gcc dot gnu.org, ||law at redhat dot com --- Comment #2 from Uroš Bizjak --- Adding some CCs for RTL-opt problems.
[Bug rtl-optimization/69891] wrong code with -mstringop-strategy=libcall @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69891 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-02-21 CC||ubizjak at gmail dot com Component|target |rtl-optimization Ever confirmed|0 |1 --- Comment #1 from Uroš Bizjak --- (In reply to Zdenek Sojka from comment #0) > Reproduces with x86_64 compiler -m32 as well. (-mno-sse has to be added in case of x86_64 compiler with -m32). This is RTL aliasing issue. We start with following _optimized tree dump: : _2 = BIT_FIELD_REF; ... _9 = _2 | 7; BIT_FIELD_REF = _9; ... v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 }; ... _19 = BIT_FIELD_REF ; ... _27 = _19 + _22; ... which gets expanded to: ;; BIT_FIELD_REF = _9; (insn 7 6 8 (parallel [ (set (reg:SI 121) (ior:SI (reg:SI 87 [ _2 ]) (const_int 7 [0x7]))) (clobber (reg:CC 17 flags)) ]) pr69891.c:19 -1 (nil)) (insn 8 7 0 (set (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256]) (reg:SI 121)) pr69891.c:19 -1 (nil)) ... (insn 13 12 0 (set (reg:SI 119 [ _117 ]) (reg:SI 125)) pr69891.c:25 -1 (nil)) ;; v32u32_1 = { 0, 0, 0, 0, 0, 0, 0, 0 }; (insn 14 13 15 (parallel [ (set (reg:SI 127) (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 64 [0x40]))) (clobber (reg:CC 17 flags)) ]) pr69891.c:31 -1 (nil)) (insn 15 14 16 (set (reg:SI 128) (const_int 32 [0x20])) pr69891.c:31 -1 (nil)) (insn 16 15 17 (parallel [ (set (reg/f:SI 7 sp) (plus:SI (reg/f:SI 7 sp) (const_int -20 [0xffec]))) (clobber (reg:CC 17 flags)) ]) pr69891.c:31 -1 (expr_list:REG_ARGS_SIZE (const_int 20 [0x14]) (nil))) (insn 17 16 18 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2 S4 A32]) (reg:SI 128)) pr69891.c:31 -1 (expr_list:REG_ARGS_SIZE (const_int 24 [0x18]) (nil))) (insn 18 17 19 (set (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [2 S4 A32]) (const_int 0 [0])) pr69891.c:31 -1 (expr_list:REG_ARGS_SIZE (const_int 28 [0x1c]) (nil))) (insn 19 18 20 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [4 S4 A32]) (reg:SI 127)) pr69891.c:31 -1 (expr_list:REG_ARGS_SIZE (const_int 32 [0x20]) (nil))) (call_insn 20 19 21 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:SI ("memset") [flags 0x41] ) [0 memset S1 A8]) (const_int 32 [0x20]))) pr69891.c:31 -1 (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)) (nil)) ... (insn 170 169 171 (set (reg:SI 202) (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256])) pr69891.c:37 -1 (nil)) (insn 171 170 172 (set (reg:SI 203) (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64])) pr69891.c:37 -1 (nil)) (insn 172 171 173 (parallel [ (set (reg:SI 201) (plus:SI (reg:SI 202) (reg:SI 203))) (clobber (reg:CC 17 flags)) ]) pr69891.c:37 -1 (expr_list:REG_EQUAL (plus:SI (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 64 [0x40])) [2 v32u32_1+0 S4 A256]) (mem/j/c:SI (plus:SI (reg/f:SI 81 virtual-incoming-args) (const_int 120 [0x78])) [3 v32u64_1+24 S4 A64])) (nil))) However, DSE1 pass propagates r121 (aka r207) from (insn 7) all the way to the (insn 170), without considering aliasing memset in (insn 20). 5: r87:SI=[argp:SI+0x40] 6: {r89:HI=-r87:SI#0;clobber flags:CC;} REG_UNUSED flags:CC 7: {r121:SI=r87:SI|0x7;clobber flags:CC;} REG_DEAD r87:SI REG_UNUSED flags:CC 186: r207:SI=r121:SI 8: [argp:SI+0x40]=r121:SI ... 14: {r127:SI=argp:SI+0x40;clobber flags:CC;} REG_UNUSED flags:CC 16: {sp:SI=sp:SI-0x14;clobber flags:CC;} REG_UNUSED flags:CC REG_ARGS_SIZE 0x14 17: [--sp:SI]=0x20 REG_ARGS_SIZE 0x18 18: [--sp:SI]=0 REG_ARGS_SIZE 0x1c 19: [--sp:SI]=r127:SI REG_DEAD r127:SI REG_ARGS_SIZE 0x20 20: ax:SI=call [`memset'] argc:0x20 REG_UNUSED ax:SI REG_EH_REGION 0 ... 170: r202:SI=r207:SI REG_DEAD r207:SI 171: r203:SI=[argp:SI+0x78] 172: {r201:SI=r202:SI+r203:SI;clobber flags:CC;} REG_DEAD r203:SI REG_DEAD r202:SI REG_UNUSED flags:CC REG_EQUAL