Re: Improve DSE in the presence of calls
On Sat, Jun 18, 2011 at 9:03 PM, Easwaran Raman era...@google.com wrote: The gimple for test2_31 before RTL expansion contains: # .MEMD.2034_2 = VDEF .MEMD.2034_1(D) s1D.2035 = s1D.1255; The RHS is a PARM_DECL. It doesn't have TREE_ADDRESSABLE and the LHS has, which makes sense. But then the move is translated to memcpy but the src memory location is still considered not addressable. Setting TREE_ADDRESSABLE to RHS should fix this bug. I don't know which is the best place to that though. At the point when it decides to use memcpy to do the copying in emit_block_move_hints, we are dealing with temporaries and we don't have the src TREE lying around. Is passing the tree EXP to emit_block_move from store_expr a reasonable approach? Well, this is the kind of issues you can run into when gimple gets lowered to RTL (I can imagine structs passed by value on a callee-copy arch will fail similarly - they'd get passed by reference and you wouldn't see the callee-copy). Until now we didn't try to disambiguate anything against calls, so the issue pops up only now. But yes, eventually calling mark_addressable on the trees that get their address exposed to function calls is a good way to avoid this. Longer term we want to expose such details by lowering them early during gimple. Richard. -Easwaran On Fri, Jun 17, 2011 at 2:16 PM, Easwaran Raman era...@google.com wrote: This patch seems to break ia64 and some other targets. I have updated Bug 49429 with a test case triage. It looks like for some EXPR, may_be_aliased returns incorrect value and hence can_escape incorrectly concludes the variable can't escape resulting in removal of useful stores. (So can_escape() returns false. But later on, in the same BB, I see: In the following list of instructions, (insn 4 3 6 2 (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64]) (reg:DI 112 in0)) y.c:23 5 {movdi_internal} (expr_list:REG_DEAD (reg:DI 112 in0) (nil))) ... (insn 36 30 37 2 (set (reg:DI 120 out0) (reg/f:DI 357)) 5 {movdi_internal} (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp) (const_int 62 [0x3e])) (nil))) (insn 37 36 38 2 (set (reg:DI 121 out1) (reg/f:DI 341)) 5 {movdi_internal} (expr_list:REG_DEAD (reg/f:DI 341) (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp) (const_int 96 [0x60])) (nil (insn 38 37 39 2 (set (reg:DI 122 out2) (const_int 31 [0x1f])) 5 {movdi_internal} (nil)) (call_insn 39 38 42 2 (parallel [ (set (reg:DI 8 r8) (call (mem:DI (symbol_ref:DI (memcpy) [flags 0x41] function_decl 0x770d2e00 memcpy) [0 memcpy S8 A64]) (const_int 1 [0x1]))) (clobber (reg:DI 320 b0)) (clobber (scratch:DI)) (clobber (scratch:DI)) ]) 332 {call_value_gp} (expr_list:REG_DEAD (reg:DI 122 out2) (expr_list:REG_DEAD (reg:DI 121 out1) (expr_list:REG_DEAD (reg:DI 120 out0) (expr_list:REG_UNUSED (reg:DI 8 r8) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)) (expr_list:REG_DEP_TRUE (use (reg:DI 1 r1)) (expr_list:REG_DEP_TRUE (use (reg:DI 122 out2)) (expr_list:REG_DEP_TRUE (use (reg:DI 121 out1)) (expr_list:REG_DEP_TRUE (use (reg:DI 120 out0)) (nil)) for the memory expression (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64]) (reg:DI 112 in0)) may_be_aliased() returns false, but register 341 is passed as the second parameter to memcpy. I suspect this is a bug elsewhere which is exposed by this patch. If someone knows why this might be happening, I can tighten the can_escape() function appropriately. Thanks, Easwaran On Tue, Jun 14, 2011 at 9:34 AM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/10/11 13:18, Easwaran Raman wrote: I am not sure I understand the problem here. If there is a wild read from asm, the instruction has the wild_read flag set. The if statement checks if that flag is set and if so it clears the bitmap - which was the original behavior. Originally, only if read_rec is non NULL you need to recompute the kill set. Now, even if read_rec is NULL, non_frame_wild_read could be set requiring the kill set to be modified, which is what this patch does. In fact, isn't what you have written above the equivalent to what is in the patch as '/* Leave this clause unchanged */' is the same as if (dump_file) fprintf (dump_file, regular read\n); scan_reads_nospill (insn_info, v, NULL); -Easwaran Ping. I have changed the test case to use int and added another test case that shows DSE doesn't happen when the struct instance is volatile (wild_read gets set in that case) What's the purpose behind using unit64_t in the testcase? Somehow I
Re: Improve DSE in the presence of calls
This seems to have triggered http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452 on arm-linux-gnueabi Might well be a DUP of PR49429 but I haven't looked at it yet. cheers Ramana
Re: Improve DSE in the presence of calls
This patch seems to break ia64 and some other targets. I have updated Bug 49429 with a test case triage. It looks like for some EXPR, may_be_aliased returns incorrect value and hence can_escape incorrectly concludes the variable can't escape resulting in removal of useful stores. (So can_escape() returns false. But later on, in the same BB, I see: In the following list of instructions, (insn 4 3 6 2 (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64]) (reg:DI 112 in0)) y.c:23 5 {movdi_internal} (expr_list:REG_DEAD (reg:DI 112 in0) (nil))) ... (insn 36 30 37 2 (set (reg:DI 120 out0) (reg/f:DI 357)) 5 {movdi_internal} (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp) (const_int 62 [0x3e])) (nil))) (insn 37 36 38 2 (set (reg:DI 121 out1) (reg/f:DI 341)) 5 {movdi_internal} (expr_list:REG_DEAD (reg/f:DI 341) (expr_list:REG_EQUAL (plus:DI (reg/f:DI 328 sfp) (const_int 96 [0x60])) (nil (insn 38 37 39 2 (set (reg:DI 122 out2) (const_int 31 [0x1f])) 5 {movdi_internal} (nil)) (call_insn 39 38 42 2 (parallel [ (set (reg:DI 8 r8) (call (mem:DI (symbol_ref:DI (memcpy) [flags 0x41] function_decl 0x770d2e00 memcpy) [0 memcpy S8 A64]) (const_int 1 [0x1]))) (clobber (reg:DI 320 b0)) (clobber (scratch:DI)) (clobber (scratch:DI)) ]) 332 {call_value_gp} (expr_list:REG_DEAD (reg:DI 122 out2) (expr_list:REG_DEAD (reg:DI 121 out1) (expr_list:REG_DEAD (reg:DI 120 out0) (expr_list:REG_UNUSED (reg:DI 8 r8) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)) (expr_list:REG_DEP_TRUE (use (reg:DI 1 r1)) (expr_list:REG_DEP_TRUE (use (reg:DI 122 out2)) (expr_list:REG_DEP_TRUE (use (reg:DI 121 out1)) (expr_list:REG_DEP_TRUE (use (reg:DI 120 out0)) (nil)) for the memory expression (set (mem/s/c:DI (reg/f:DI 341) [2 s1+0 S8 A64]) (reg:DI 112 in0)) may_be_aliased() returns false, but register 341 is passed as the second parameter to memcpy. I suspect this is a bug elsewhere which is exposed by this patch. If someone knows why this might be happening, I can tighten the can_escape() function appropriately. Thanks, Easwaran On Tue, Jun 14, 2011 at 9:34 AM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/10/11 13:18, Easwaran Raman wrote: I am not sure I understand the problem here. If there is a wild read from asm, the instruction has the wild_read flag set. The if statement checks if that flag is set and if so it clears the bitmap - which was the original behavior. Originally, only if read_rec is non NULL you need to recompute the kill set. Now, even if read_rec is NULL, non_frame_wild_read could be set requiring the kill set to be modified, which is what this patch does. In fact, isn't what you have written above the equivalent to what is in the patch as '/* Leave this clause unchanged */' is the same as if (dump_file) fprintf (dump_file, regular read\n); scan_reads_nospill (insn_info, v, NULL); -Easwaran Ping. I have changed the test case to use int and added another test case that shows DSE doesn't happen when the struct instance is volatile (wild_read gets set in that case) What's the purpose behind using unit64_t in the testcase? Somehow I suspect using int64_t means the test is unlikely not going to work across targets with different word sizes. Sorry for the exceedingly long wait. Things have been a bit crazy the last several weeks. On a positive note, re-reading things now I think my objection/comment was mis-guided. Patch approved, and again, sorry for the absurdly long period of non-responsiveness. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJN942IAAoJEBRtltQi2kC7Fj4IAIUvXsEKHZEKHS2k/psJWyaM Uo/vW3CLydRP0+Np/VVSwzHlmWfdWmOj1WPw1Svhvr4gP8BrZ13okVv5jbw1Hh3Y R4mShXFK5eYzmGx5wL54hOze5zViN3gomNGbDAAhk6TCzNXmPyLT/6V1tLFTNhD5 6zOiW8pXh9ik6qTTCKbG0EMuJXDnIbYrJs4d/gHFerUgmRPc8adKjF3PCngD3F4r 40n9W/UxUejYUddavDW1fIdALWYc56F3glplFsII7SMnOmih8MTFYOvk6SZsLS5O G2nzmnUuwt6tPWTyk9bpVKQi5dn8MmLkM13w22t36GKIg6OER2KfUdv44dgE7yw= =o7AI -END PGP SIGNATURE-
Re: Improve DSE in the presence of calls
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/10/11 13:18, Easwaran Raman wrote: I am not sure I understand the problem here. If there is a wild read from asm, the instruction has the wild_read flag set. The if statement checks if that flag is set and if so it clears the bitmap - which was the original behavior. Originally, only if read_rec is non NULL you need to recompute the kill set. Now, even if read_rec is NULL, non_frame_wild_read could be set requiring the kill set to be modified, which is what this patch does. In fact, isn't what you have written above the equivalent to what is in the patch as '/* Leave this clause unchanged */' is the same as if (dump_file) fprintf (dump_file, regular read\n); scan_reads_nospill (insn_info, v, NULL); -Easwaran Ping. I have changed the test case to use int and added another test case that shows DSE doesn't happen when the struct instance is volatile (wild_read gets set in that case) What's the purpose behind using unit64_t in the testcase? Somehow I suspect using int64_t means the test is unlikely not going to work across targets with different word sizes. Sorry for the exceedingly long wait. Things have been a bit crazy the last several weeks. On a positive note, re-reading things now I think my objection/comment was mis-guided. Patch approved, and again, sorry for the absurdly long period of non-responsiveness. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJN942IAAoJEBRtltQi2kC7Fj4IAIUvXsEKHZEKHS2k/psJWyaM Uo/vW3CLydRP0+Np/VVSwzHlmWfdWmOj1WPw1Svhvr4gP8BrZ13okVv5jbw1Hh3Y R4mShXFK5eYzmGx5wL54hOze5zViN3gomNGbDAAhk6TCzNXmPyLT/6V1tLFTNhD5 6zOiW8pXh9ik6qTTCKbG0EMuJXDnIbYrJs4d/gHFerUgmRPc8adKjF3PCngD3F4r 40n9W/UxUejYUddavDW1fIdALWYc56F3glplFsII7SMnOmih8MTFYOvk6SZsLS5O G2nzmnUuwt6tPWTyk9bpVKQi5dn8MmLkM13w22t36GKIg6OER2KfUdv44dgE7yw= =o7AI -END PGP SIGNATURE-
Re: Ping: Re: Improve DSE in the presence of calls
Ping. Diego, David, Is this patch OK for google/main? -Easwaran On Thu, Jun 2, 2011 at 4:48 PM, Easwaran Raman era...@google.com wrote: Ping. On Sat, May 14, 2011 at 8:01 AM, Easwaran Raman era...@google.com wrote: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00781.html
Re: Ping: Re: Improve DSE in the presence of calls
Ping. On Sat, May 14, 2011 at 8:01 AM, Easwaran Raman era...@google.com wrote: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00781.html
Re: Ping: Re: Improve DSE in the presence of calls
Ping. On Sat, May 14, 2011 at 8:01 AM, Easwaran Raman era...@google.com wrote: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00781.html
Ping: Re: Improve DSE in the presence of calls
http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00781.html
Re: Improve DSE in the presence of calls
On Tue, May 3, 2011 at 9:40 AM, Easwaran Raman era...@google.com wrote: On Mon, May 2, 2011 at 8:37 PM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/26/11 16:06, Easwaran Raman wrote: You're right. The patch has correctness issues. It is not possible to simply not call add_wild_read because it also resets active_local_stores and frees read_recs. During the local phase it seems better to just treat calls as wild reads and reset active_local_stores and free read_recs. I've now refactored add_wild_read so that resetting active_local_stores and free read_recs are in separate methods that can be called on non-const/non-memset calls. In addition, I have added a non_frame_wild_read field in insn_info to mark non-const and non-memset calls. I've attached the revised patch. Looking better. Just a few more things. Don't all locals escape if the callee has a static chain? Is that handled anywhere? Don't you still have the potential for wild reads in dse_step5_nospill (say from an asm)? if so, then the change can't be correct. Couldn't you just add a clause like this before the else-if? else if (insn_info-non_frame_wild_read) { if (dump_file) fprintf (dump_file, non-frame wild read\n); scan_reads_nospill (insn_info, v, NULL); } else if (insn_info-read_rec) { /* Leave this clause unchanged */ } Am I missing something? I am not sure I understand the problem here. If there is a wild read from asm, the instruction has the wild_read flag set. The if statement checks if that flag is set and if so it clears the bitmap - which was the original behavior. Originally, only if read_rec is non NULL you need to recompute the kill set. Now, even if read_rec is NULL, non_frame_wild_read could be set requiring the kill set to be modified, which is what this patch does. In fact, isn't what you have written above the equivalent to what is in the patch as '/* Leave this clause unchanged */' is the same as if (dump_file) fprintf (dump_file, regular read\n); scan_reads_nospill (insn_info, v, NULL); -Easwaran Ping. I have changed the test case to use int and added another test case that shows DSE doesn't happen when the struct instance is volatile (wild_read gets set in that case) What's the purpose behind using unit64_t in the testcase? Somehow I suspect using int64_t means the test is unlikely not going to work across targets with different word sizes. Jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNv3iGAAoJEBRtltQi2kC7C8oH+gPi/6DvV2JkmitT3xEeUaW8 szOHC0GysQkot/TiVlQQy6fiz61G49ii0mz68W4cfSntvuN7iaqBqVWpRKWIzvkx alk4U1snj9a2t9+9ZRTX4xm3quggTv+mUDK81a+DIS2zAf6i/HRXLvQbx6fhDOYD sqXqSkvCKqkH2pPHHYEqnBtS/cLFtAfZJe+JSx3u2oqL+sBFGLftdoV6yJzkShLS LpmYHMeDbzhdCtZTf15GQm3h/cBlyrChxjQsjJxLiXk5rrcDI/X+Pqi+cll21Gwg mlzMBi0iYToENl+7aO8DNGvXfliCEzQ7iEUyTctJiTDt3/RgVcaxgRJgqHZNSBI= =VOdm -END PGP SIGNATURE- Index: gcc/testsuite/gcc.dg/pr44194-1.c === --- gcc/testsuite/gcc.dg/pr44194-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr44194-1.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-dse1 } */ +#include stdint.h + +struct ints { int a, b, c; } foo(); +void bar(int a, int b); + +void func() { + struct ints s = foo(); + bar(s.a, s.b); +} +/* { dg-final { scan-rtl-dump global deletions = 2 dse1 } } */ +/* { dg-final { cleanup-rtl-dump dse1 } } */ Index: gcc/testsuite/gcc.dg/pr44194-2.c === --- gcc/testsuite/gcc.dg/pr44194-2.c (revision 0) +++ gcc/testsuite/gcc.dg/pr44194-2.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-dse1 } */ +#include stdint.h + +struct ints { int a, b, c; } foo(); +void bar(int a, int b); + +void func() { + volatile struct ints s = foo(); + bar(s.a, s.b); +} +/* { dg-final { scan-rtl-dump global deletions = 0 dse1 } } */ +/* { dg-final { cleanup-rtl-dump dse1 } } */ Index: gcc/dse.c === --- gcc/dse.c (revision 172844) +++ gcc/dse.c (working copy) @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see #include dbgcnt.h #include target.h #include params.h +#include tree-flow.h /* This file contains three techniques for performing Dead Store Elimination (dse). @@ -326,6 +327,11 @@ struct insn_info contains a wild read, the use_rec will be null. */ bool wild_read; + /* This is true only for CALL instructions which could potentially read + any non-frame memory location. This field is used by the global + algorithm. */ + bool non_frame_wild_read; + /* This field is only used for the processing of const functions.
Re: Improve DSE in the presence of calls
On Tue, May 3, 2011 at 5:37 AM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/26/11 16:06, Easwaran Raman wrote: You're right. The patch has correctness issues. It is not possible to simply not call add_wild_read because it also resets active_local_stores and frees read_recs. During the local phase it seems better to just treat calls as wild reads and reset active_local_stores and free read_recs. I've now refactored add_wild_read so that resetting active_local_stores and free read_recs are in separate methods that can be called on non-const/non-memset calls. In addition, I have added a non_frame_wild_read field in insn_info to mark non-const and non-memset calls. I've attached the revised patch. Looking better. Just a few more things. Don't all locals escape if the callee has a static chain? Is that handled anywhere? Locals that escape this way get pushed into a struct variable which then has its address taken and passed to the call. This makes it aliased. So yes, that's automagically handled. Richard. Don't you still have the potential for wild reads in dse_step5_nospill (say from an asm)? if so, then the change can't be correct. Couldn't you just add a clause like this before the else-if? else if (insn_info-non_frame_wild_read) { if (dump_file) fprintf (dump_file, non-frame wild read\n); scan_reads_nospill (insn_info, v, NULL); } else if (insn_info-read_rec) { /* Leave this clause unchanged */ } Am I missing something? What's the purpose behind using unit64_t in the testcase? Somehow I suspect using int64_t means the test is unlikely not going to work across targets with different word sizes. Jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNv3iGAAoJEBRtltQi2kC7C8oH+gPi/6DvV2JkmitT3xEeUaW8 szOHC0GysQkot/TiVlQQy6fiz61G49ii0mz68W4cfSntvuN7iaqBqVWpRKWIzvkx alk4U1snj9a2t9+9ZRTX4xm3quggTv+mUDK81a+DIS2zAf6i/HRXLvQbx6fhDOYD sqXqSkvCKqkH2pPHHYEqnBtS/cLFtAfZJe+JSx3u2oqL+sBFGLftdoV6yJzkShLS LpmYHMeDbzhdCtZTf15GQm3h/cBlyrChxjQsjJxLiXk5rrcDI/X+Pqi+cll21Gwg mlzMBi0iYToENl+7aO8DNGvXfliCEzQ7iEUyTctJiTDt3/RgVcaxgRJgqHZNSBI= =VOdm -END PGP SIGNATURE-
Re: Improve DSE in the presence of calls
On Mon, May 2, 2011 at 8:37 PM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/26/11 16:06, Easwaran Raman wrote: You're right. The patch has correctness issues. It is not possible to simply not call add_wild_read because it also resets active_local_stores and frees read_recs. During the local phase it seems better to just treat calls as wild reads and reset active_local_stores and free read_recs. I've now refactored add_wild_read so that resetting active_local_stores and free read_recs are in separate methods that can be called on non-const/non-memset calls. In addition, I have added a non_frame_wild_read field in insn_info to mark non-const and non-memset calls. I've attached the revised patch. Looking better. Just a few more things. Don't all locals escape if the callee has a static chain? Is that handled anywhere? Don't you still have the potential for wild reads in dse_step5_nospill (say from an asm)? if so, then the change can't be correct. Couldn't you just add a clause like this before the else-if? else if (insn_info-non_frame_wild_read) { if (dump_file) fprintf (dump_file, non-frame wild read\n); scan_reads_nospill (insn_info, v, NULL); } else if (insn_info-read_rec) { /* Leave this clause unchanged */ } Am I missing something? I am not sure I understand the problem here. If there is a wild read from asm, the instruction has the wild_read flag set. The if statement checks if that flag is set and if so it clears the bitmap - which was the original behavior. Originally, only if read_rec is non NULL you need to recompute the kill set. Now, even if read_rec is NULL, non_frame_wild_read could be set requiring the kill set to be modified, which is what this patch does. In fact, isn't what you have written above the equivalent to what is in the patch as '/* Leave this clause unchanged */' is the same as if (dump_file) fprintf (dump_file, regular read\n); scan_reads_nospill (insn_info, v, NULL); -Easwaran What's the purpose behind using unit64_t in the testcase? Somehow I suspect using int64_t means the test is unlikely not going to work across targets with different word sizes. Jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNv3iGAAoJEBRtltQi2kC7C8oH+gPi/6DvV2JkmitT3xEeUaW8 szOHC0GysQkot/TiVlQQy6fiz61G49ii0mz68W4cfSntvuN7iaqBqVWpRKWIzvkx alk4U1snj9a2t9+9ZRTX4xm3quggTv+mUDK81a+DIS2zAf6i/HRXLvQbx6fhDOYD sqXqSkvCKqkH2pPHHYEqnBtS/cLFtAfZJe+JSx3u2oqL+sBFGLftdoV6yJzkShLS LpmYHMeDbzhdCtZTf15GQm3h/cBlyrChxjQsjJxLiXk5rrcDI/X+Pqi+cll21Gwg mlzMBi0iYToENl+7aO8DNGvXfliCEzQ7iEUyTctJiTDt3/RgVcaxgRJgqHZNSBI= =VOdm -END PGP SIGNATURE-
Re: Improve DSE in the presence of calls
On Mon, Apr 25, 2011 at 10:03 AM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/22/11 14:19, Easwaran Raman wrote: Hi, This patch improves RTL DSE by not assuming that calls read all memory locations. With this patch, calls are assumed to read any non-frame memory and any stack variables that can potentially escape. This patch partly addresses PR rtl-optimization/44194. Bootstraps and no test regressions. OK for trunk? Thanks, Easwaran 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * dse.c (header files): Include tree-flow.h. (group_info): Add fields. (globals): Add a new variable kill_on_calls. (get_group_info): Initialize added fields. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group-escaped_*. (scan_reads_nospill): Handle call instructions. (find_insn_before_first_wild_read): Remove the function. (dse_step3_scan): Remove call to find_insn_before_first_wild_read. (dse_step5_nospill): Do not kill everything on call. testsuite/ChangeLog: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * gcc.dg/pr44194-1.c: New test. On a bookkeeping note, it doesn't appear that you ever release the bitmaps for gi-escaped_{p,n} or kill_on_calls. This should probably be done in dse_step7. I've now freed the bitmaps in dse_step7. I'm going to assume that you and Richi have agreed upon the form of can_escape. Yes, Richard suggested to use may_be_aliased for now to check for escaping. AFAICT, even after your patch we still have a wild read recorded for most CALL_INSNs (see this code in scan_insn which was not modified): else /* Every other call, including pure functions, may read memory. */ add_wild_read (bb_info); It appears that you effectively ignore the wild_read in scan_reads_nospill by special casing CALL_INSNs. Correct? Is so, does it make still sense to call add_wild_read in scan_insn? ISTM that wild reads are still possible due to a few other constructs such as asms, address canonicalization failure, etc, so why is the removal of the wild read support in dse_step5_nospill correct? You're right. The patch has correctness issues. It is not possible to simply not call add_wild_read because it also resets active_local_stores and frees read_recs. During the local phase it seems better to just treat calls as wild reads and reset active_local_stores and free read_recs. I've now refactored add_wild_read so that resetting active_local_stores and free read_recs are in separate methods that can be called on non-const/non-memset calls. In addition, I have added a non_frame_wild_read field in insn_info to mark non-const and non-memset calls. I've attached the revised patch. -Easwaran Just to be clear, I generally like the change, but there's some details that I think need to be better understood before installing. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNtalUAAoJEBRtltQi2kC7evMH/0UyasIBTcZM7MSeCDGvefdR byd5/XaaH1FtdRS64ne1bGbjB/h96AJbXYfc3QECtKugfmwgSbJeaN/BD8T4WY/L 2z1MkeSwcpswnxufjTR6G5WezuUsxagB4/xoSqk7NRfJdsM3eBR9n+ehTwlfsU5q bTvQ/Uat0x027ddyrifD6vIOIIqFwMba9CvvN7vV0O+yzuxjzxNfe4IEcyMg2RIZ j6xK+Bv7pge9pHC8ERKOFO17CPdK2JBe4ovtFL8s1sadBkjO2044uRszcl8E/9rj MUGHExtFEIypzzi0wBqWyy3fz5QRCBuN/Xj6ptk4Cw7dkuBTZPWjySWrXV5kEs4= =omzC -END PGP SIGNATURE- 2011-04-26 Easwaran Raman era...@google.com * dse.c: Include tree-flow.h (insn_info): Add new field non_frame_wild_read. (group_info): Add new fields escaped_n and escaped_p. (kill_on_calls): New variable. (get_group_info): Initialize gi-escaped_n and gi-escaped_p. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (set_usage_bits): Add additional parameter; record information about escaped locations. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group-escaped_n and group-escaped_n. (add_wild_read): Refactor into... (reset_active_stores): ... New method, and (free_read_records): ... New method. (add_non_frame_wild_read): New method. (scan_insn): Call add_non_frame_wild_read on non-const calls. (scan_reads_nospill): Handle instructions with non_frame_wild_read. (dse_step5_nospill): Call scan_reads_nospill for instructions marked as non_frame_wild_read. (dse_step7): Free escaped_n, escaped_p and kill_on_calls bitmaps. testsuite/ChangeLog 2011-04-26 Easwaran Raman era...@google.com
Re: Improve DSE in the presence of calls
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/22/11 14:19, Easwaran Raman wrote: Hi, This patch improves RTL DSE by not assuming that calls read all memory locations. With this patch, calls are assumed to read any non-frame memory and any stack variables that can potentially escape. This patch partly addresses PR rtl-optimization/44194. Bootstraps and no test regressions. OK for trunk? Thanks, Easwaran 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * dse.c (header files): Include tree-flow.h. (group_info): Add fields. (globals): Add a new variable kill_on_calls. (get_group_info): Initialize added fields. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group-escaped_*. (scan_reads_nospill): Handle call instructions. (find_insn_before_first_wild_read): Remove the function. (dse_step3_scan): Remove call to find_insn_before_first_wild_read. (dse_step5_nospill): Do not kill everything on call. testsuite/ChangeLog: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * gcc.dg/pr44194-1.c: New test. On a bookkeeping note, it doesn't appear that you ever release the bitmaps for gi-escaped_{p,n} or kill_on_calls. This should probably be done in dse_step7. I'm going to assume that you and Richi have agreed upon the form of can_escape. AFAICT, even after your patch we still have a wild read recorded for most CALL_INSNs (see this code in scan_insn which was not modified): else /* Every other call, including pure functions, may read memory. */ add_wild_read (bb_info); It appears that you effectively ignore the wild_read in scan_reads_nospill by special casing CALL_INSNs. Correct? Is so, does it make still sense to call add_wild_read in scan_insn? ISTM that wild reads are still possible due to a few other constructs such as asms, address canonicalization failure, etc, so why is the removal of the wild read support in dse_step5_nospill correct? Just to be clear, I generally like the change, but there's some details that I think need to be better understood before installing. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNtalUAAoJEBRtltQi2kC7evMH/0UyasIBTcZM7MSeCDGvefdR byd5/XaaH1FtdRS64ne1bGbjB/h96AJbXYfc3QECtKugfmwgSbJeaN/BD8T4WY/L 2z1MkeSwcpswnxufjTR6G5WezuUsxagB4/xoSqk7NRfJdsM3eBR9n+ehTwlfsU5q bTvQ/Uat0x027ddyrifD6vIOIIqFwMba9CvvN7vV0O+yzuxjzxNfe4IEcyMg2RIZ j6xK+Bv7pge9pHC8ERKOFO17CPdK2JBe4ovtFL8s1sadBkjO2044uRszcl8E/9rj MUGHExtFEIypzzi0wBqWyy3fz5QRCBuN/Xj6ptk4Cw7dkuBTZPWjySWrXV5kEs4= =omzC -END PGP SIGNATURE-
Improve DSE in the presence of calls
Hi, This patch improves RTL DSE by not assuming that calls read all memory locations. With this patch, calls are assumed to read any non-frame memory and any stack variables that can potentially escape. This patch partly addresses PR rtl-optimization/44194. Bootstraps and no test regressions. OK for trunk? Thanks, Easwaran 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * dse.c (header files): Include tree-flow.h. (group_info): Add fields. (globals): Add a new variable kill_on_calls. (get_group_info): Initialize added fields. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group-escaped_*. (scan_reads_nospill): Handle call instructions. (find_insn_before_first_wild_read): Remove the function. (dse_step3_scan): Remove call to find_insn_before_first_wild_read. (dse_step5_nospill): Do not kill everything on call. testsuite/ChangeLog: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * gcc.dg/pr44194-1.c: New test. Index: gcc/testsuite/gcc.dg/pr44194-1.c === --- gcc/testsuite/gcc.dg/pr44194-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr44194-1.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-dse1 } */ +#include stdint.h + +struct twoints { uint64_t a, b; } foo(); +void bar(uint64_t a, uint64_t b); + +void func() { + struct twoints s = foo(); + bar(s.a, s.b); +} +/* { dg-final { scan-rtl-dump global deletions = 2 dse1 } } */ +/* { dg-final { cleanup-rtl-dump dse1 } } */ Index: gcc/dse.c === --- gcc/dse.c (revision 172844) +++ gcc/dse.c (working copy) @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. If not see #include dbgcnt.h #include target.h #include params.h +#include tree-flow.h /* This file contains three techniques for performing Dead Store Elimination (dse). @@ -501,6 +502,11 @@ struct group_info deleted. */ bitmap store1_n, store1_p, store2_n, store2_p; + /* These bitmaps keep track of offsets in this group escape this function. + An offset escapes if it corresponds to a named variable whose + addressable flag is set. */ + bitmap escaped_n, escaped_p; + /* The positions in this bitmap have the same assignments as the in, out, gen and kill bitmaps. This bitmap is all zeros except for the positions that are occupied by stores for this group. */ @@ -588,6 +594,9 @@ static int spill_deleted; static bitmap all_blocks; +/* Locations that are killed by calls in the global phase. */ +static bitmap kill_on_calls; + /* The number of bits used in the global bitmaps. */ static unsigned int current_position; @@ -692,6 +701,8 @@ get_group_info (rtx base) gi-store1_p = BITMAP_ALLOC (NULL); gi-store2_n = BITMAP_ALLOC (NULL); gi-store2_p = BITMAP_ALLOC (NULL); + gi-escaped_p = BITMAP_ALLOC (NULL); + gi-escaped_n = BITMAP_ALLOC (NULL); gi-group_kill = BITMAP_ALLOC (NULL); gi-process_globally = false; gi-offset_map_size_n = 0; @@ -714,6 +725,8 @@ get_group_info (rtx base) gi-store1_p = BITMAP_ALLOC (NULL); gi-store2_n = BITMAP_ALLOC (NULL); gi-store2_p = BITMAP_ALLOC (NULL); + gi-escaped_p = BITMAP_ALLOC (NULL); + gi-escaped_n = BITMAP_ALLOC (NULL); gi-group_kill = BITMAP_ALLOC (NULL); gi-process_globally = false; gi-frame_related = @@ -739,6 +752,7 @@ dse_step0 (void) spill_deleted = 0; scratch = BITMAP_ALLOC (NULL); + kill_on_calls = BITMAP_ALLOC (NULL); rtx_store_info_pool = create_alloc_pool (rtx_store_info_pool, @@ -881,31 +895,48 @@ delete_dead_store_insn (insn_info_t insn_info) insn_info-wild_read = false; } +/* Check if EXPR can possibly escape the current function scope. */ +static bool +can_escape (tree expr) +{ + tree base; + if (!expr) +return true; + base = get_base_address (expr); + if (DECL_P (base) + !may_be_aliased (base)) +return false; + return true; +} /* Set the store* bitmaps offset_map_size* fields in GROUP based on OFFSET and WIDTH. */ static void -set_usage_bits (group_info_t group, HOST_WIDE_INT offset, HOST_WIDE_INT width) +set_usage_bits (group_info_t group, HOST_WIDE_INT offset, HOST_WIDE_INT width, +tree expr) { HOST_WIDE_INT i; - + bool expr_escapes = can_escape (expr); if (offset -MAX_OFFSET offset + width MAX_OFFSET) for (i=offset; ioffset+width; i++) { bitmap store1; bitmap store2; +bitmap escaped; int ai; if (i 0) { store1 = group-store1_n; store2 = group-store2_n; + escaped = group-escaped_n; ai = -i; } else { store1 =
Re: Improve DSE in the presence of calls
On Fri, Apr 22, 2011 at 01:19:17PM -0700, Easwaran Raman wrote: The ChangeLog entry has various issues: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 This should have tab before PR as well. * dse.c (header files): Include tree-flow.h. This should be just * dse.c: Include tree-flow.h. (group_info): Add fields. (struct group_info): Add escaped_n and escaped_p fields. (it is always better to be be explicit on what you added). (globals): Add a new variable kill_on_calls. (kill_on_calls): New variable. (get_group_info): Initialize added fields. Instead of added fields again mention which fields were added. 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 Again, tab before PR. * gcc.dg/pr44194-1.c: New test. Jakub
Re: Improve DSE in the presence of calls
On Fri, Apr 22, 2011 at 1:26 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Apr 22, 2011 at 01:19:17PM -0700, Easwaran Raman wrote: The ChangeLog entry has various issues: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 This should have tab before PR as well. * dse.c (header files): Include tree-flow.h. This should be just * dse.c: Include tree-flow.h. (group_info): Add fields. (struct group_info): Add escaped_n and escaped_p fields. (it is always better to be be explicit on what you added). (globals): Add a new variable kill_on_calls. (kill_on_calls): New variable. (get_group_info): Initialize added fields. Instead of added fields again mention which fields were added. 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 Again, tab before PR. * gcc.dg/pr44194-1.c: New test. Jakub Thanks for the comments. The fixed ChangeLog given below. -Easwaran 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * dse.c: Include tree-flow.h. (group_info): Add escaped_n and escaped_p fields. (kill_on_calls): New variable. (get_group_info): Initialize gi-escaped_n and gi-escaped_p. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group-escaped_n and group-escaped_n. (scan_reads_nospill): Handle call instructions. (find_insn_before_first_wild_read): Remove the function. (dse_step3_scan): Remove call to find_insn_before_first_wild_read. (dse_step5_nospill): Do not kill everything on call. testsuite/ChangeLog: 2011-04-22 Easwaran Raman era...@google.com PR rtl-optimization/44194 * gcc.dg/pr44194-1.c: New test.