[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #23 from Carrot <carrot at google dot com> 2011-09-16 06:57:15 UTC ---
(In reply to comment #21)
> > All callee saved registers should never changed after function call. Here
> > fp has been changed is not because it is after a function call, it is
> > because it is after the target of non local goto. I'm not familiar with
> > the implementation of non local goto, but I guess there is some
> > convention/protocol defines which registers may be changed after the
> > target of a non local goto.
>
> That's not the problem. The problem is that the blockage isn't honored.

It seems postreload.c should be changed to the following to avoid combining

--- postreload.c (revision 178904)
+++ postreload.c (working copy)
@@ -1312,7 +1312,7 @@ reload_combine (void)
      is and then later disable any optimization that would cross it.  */
       if (LABEL_P (insn))
         last_label_ruid = reload_combine_ruid;
-      else if (BARRIER_P (insn))
+      else if (BARRIER_P (insn) || BLOCKAGE_P (insn))
         for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
           if (! fixed_regs[r])
             reg_state[r].use_index = RELOAD_COMBINE_MAX_USES;

BLOCKAGE_P (insn) is used to detect if insn is a blockage insn, is there any
available function/macro that implement this functionality?
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #24 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-09-16 21:24:30 UTC ---
> It seems postreload.c should be changed to the following to avoid combining
>
> --- postreload.c (revision 178904)
> +++ postreload.c (working copy)
> @@ -1312,7 +1312,7 @@ reload_combine (void)
>       is and then later disable any optimization that would cross it.  */
>        if (LABEL_P (insn))
>          last_label_ruid = reload_combine_ruid;
> -      else if (BARRIER_P (insn))
> +      else if (BARRIER_P (insn) || BLOCKAGE_P (insn))
>          for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
>            if (! fixed_regs[r])
>              reg_state[r].use_index = RELOAD_COMBINE_MAX_USES;
>
> BLOCKAGE_P (insn) is used to detect if insn is a blockage insn, is there any
> available function/macro that implement this functionality?

volatile_insn_p would seem to be appropriate.
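
To make the effect of the proposed change concrete, here is a minimal toy model in C. It is not GCC code: the enum, the function name toy_may_combine and the honor_blockage flag are all invented for illustration. It only mimics the idea in the patch, namely that an optimization must not pair a definition and a use when a barrier (or, with the fix, a blockage) sits between them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for insn kinds. These names are hypothetical and are
   not GCC's own predicates or RTL codes. */
enum toy_kind { TOY_INSN, TOY_LABEL, TOY_BARRIER, TOY_BLOCKAGE };

/* Return true if nothing between positions DEF and USE forbids
   combining them.  A barrier always forbids it.  A blockage forbids
   it only when HONOR_BLOCKAGE is set, which models the proposed
   "|| BLOCKAGE_P (insn)" test.  */
bool toy_may_combine(const enum toy_kind *insns, size_t n,
                     size_t def, size_t use, bool honor_blockage)
{
    assert(def < use && use < n);
    for (size_t i = def + 1; i < use; i++) {
        if (insns[i] == TOY_BARRIER)
            return false;
        if (honor_blockage && insns[i] == TOY_BLOCKAGE)
            return false;
    }
    return true;
}
```

With a stream shaped like the one in comment #15 (fp adjustment, use, blockage, load), calling toy_may_combine with honor_blockage set to false allows the pairing across the blockage, while setting it to true rejects it. Whether volatile_insn_p is the right predicate for the real test is the open question in the comment above; the toy does not settle that.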
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #21 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-09-14 06:48:01 UTC ---
> All callee saved registers should never changed after function call. Here fp
> has been changed is not because it is after a function call, it is because
> it is after the target of non local goto. I'm not familiar with the
> implementation of non local goto, but I guess there is some
> convention/protocol defines which registers may be changed after the target
> of a non local goto.

That's not the problem. The problem is that the blockage isn't honored.
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #22 from Ramana Radhakrishnan <ramana.r at gmail dot com> 2011-09-14 20:26:43 UTC ---
On 14 Sep 2011, at 07:48, ebotcazou at gcc dot gnu.org <gcc-bugzi...@gcc.gnu.org> wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452
>
> --- Comment #21 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-09-14 06:48:01 UTC ---
> > All callee saved registers should never changed after function call. Here
> > fp has been changed is not because it is after a function call, it is
> > because it is after the target of non local goto. I'm not familiar with
> > the implementation of non local goto, but I guess there is some
> > convention/protocol defines which registers may be changed after the
> > target of a non local goto.
>
> That's not the problem. The problem is that the blockage isn't honored.

By ?
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

Carrot <carrot at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carrot at google dot com

--- Comment #19 from Carrot <carrot at google dot com> 2011-09-13 08:37:52 UTC ---
(In reply to comment #15)
> The machine-dependent reorg pass does something unexpected:
>
> (insn 30 18 14 3 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
> (insn 14 30 16 3 (use (reg/f:SI 11 fp)) -1 (nil))
> (insn 16 14 24 3 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> (insn 24 16 27 3 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
>
> is reordered into:
>
> (insn 14 18 16 (use (reg/f:SI 11 fp)) -1 (nil))
> (insn 16 14 24 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> (insn 24 16 30 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -20 [0xffec])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
> (insn 30 24 27 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
>
> despite the blockage.

I observed the same transformation, but in postreload pass. My command line is

-march=armv7-a -mfloat-abi=soft -Os -w
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #20 from Carrot <carrot at google dot com> 2011-09-14 03:02:03 UTC ---
> Instruction 2 and 24 refer to the same location, but have different offset
> relative to FP because the call to y changes FP. DSE doesn't (and can not, if
> it is intra-procedural) know that they both refer to the same location and
> hence thinks insn 2 is dead. It seems to me this (FP having different value
> after the call) can only happen at postreload. It seems to me that setting
> wild_read (not just non_frame_wild_read) on all calls after postreload will
> fix this problem. What's the best way to do that? Will checking for
> clear_alias_sets != NULL work?

All callee saved registers should never changed after function call. Here fp
has been changed is not because it is after a function call, it is because it
is after the target of non local goto. I'm not familiar with the implementation
of non local goto, but I guess there is some convention/protocol defines which
registers may be changed after the target of a non local goto.
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #17 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> 2011-07-18 16:35:22 UTC ---
(In reply to comment #16)
> (In reply to comment #15)
> > The machine-dependent reorg pass does something unexpected:
> >
> > (insn 30 18 14 3 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
> > (insn 14 30 16 3 (use (reg/f:SI 11 fp)) -1 (nil))
> > (insn 16 14 24 3 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> > (insn 24 16 27 3 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
> >
> > is reordered into:
> >
> > (insn 14 18 16 (use (reg/f:SI 11 fp)) -1 (nil))
> > (insn 16 14 24 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> > (insn 24 16 30 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -20 [0xffec])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
> > (insn 30 24 27 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
> >
> > despite the blockage.
>
> Hmmm I'm not sure I see this - what's the configure and arch. specific flags
> you used just in case ?

Just in case - I am using -march=armv5t -marm -Os on the command line.

Ramana
>
> cheers
> Ramana
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #16 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> 2011-07-18 16:31:12 UTC ---
(In reply to comment #15)
> The machine-dependent reorg pass does something unexpected:
>
> (insn 30 18 14 3 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
> (insn 14 30 16 3 (use (reg/f:SI 11 fp)) -1 (nil))
> (insn 16 14 24 3 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> (insn 24 16 27 3 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
>
> is reordered into:
>
> (insn 14 18 16 (use (reg/f:SI 11 fp)) -1 (nil))
> (insn 16 14 24 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
> (insn 24 16 30 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -20 [0xffec])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
> (insn 30 24 27 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
>
> despite the blockage.

Hmmm I'm not sure I see this - what's the configure and arch. specific flags
you used just in case ?

cheers
Ramana
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #18 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-07-18 17:59:04 UTC ---
> Hmmm I'm not sure I see this - what's the configure and arch. specific flags
> you used just in case ?

Flags are just -Os.
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #15 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-07-16 21:18:39 UTC ---
The machine-dependent reorg pass does something unexpected:

(insn 30 18 14 3 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))
(insn 14 30 16 3 (use (reg/f:SI 11 fp)) -1 (nil))
(insn 16 14 24 3 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
(insn 24 16 27 3 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))

is reordered into:

(insn 14 18 16 (use (reg/f:SI 11 fp)) -1 (nil))
(insn 16 14 24 (unspec_volatile [ (const_int 0 [0]) ] VUNSPEC_BLOCKAGE) 252 {blockage} (nil))
(insn 24 16 30 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -20 [0xffec])) [6 %sfp+-20 S4 A32])) comp-goto-2.c:26 176 {*arm_movsi_insn} (nil))
(insn 30 24 27 (set (reg/f:SI 11 fp) (plus:SI (reg/f:SI 11 fp) (const_int 36 [0x24]))) 4 {*arm_addsi3} (nil))

despite the blockage.
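
The change of displacement in insn 24 follows from simple arithmetic on the dump above: insn 30 adds 36 to fp, so a load from fp-56 placed after the add reads the same address as a load from fp-20 placed before it. The small C sketch below checks that arithmetic. The helper names are made up for this illustration and do not correspond to anything in GCC; the sample fp value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function: the displacement to use
   when a reference to (fp + DISP) is moved above "fp = fp + ADD".  */
int32_t toy_fold_displacement(int32_t add, int32_t disp)
{
    /* Keep the toy overflow-free; the offsets in the dump are tiny.  */
    assert(add > -65536 && add < 65536);
    assert(disp > -65536 && disp < 65536);
    return add + disp;
}

/* Address denoted by (fp + DISP) for a given frame pointer value.  */
uintptr_t toy_address(uintptr_t fp, int32_t disp)
{
    return fp + (uintptr_t)(intptr_t)disp;
}
```

Here toy_fold_displacement(36, -56) yields -20, matching the rewritten insn 24. The rewrite is therefore consistent as far as address arithmetic goes; the complaint in the comment is only that it moves the load across the blockage.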
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #14 from Eric Botcazou <ebotcazou at gcc dot gnu.org> 2011-07-15 06:26:17 UTC ---
> Instruction 2 and 24 refer to the same location, but have different offset
> relative to FP because the call to y changes FP. DSE doesn't (and can not, if
> it is intra-procedural) know that they both refer to the same location and
> hence thinks insn 2 is dead. It seems to me this (FP having different value
> after the call) can only happen at postreload.

Hum, this shouldn't happen at all I think. This looks like a latent bug in the
implementation of non-local gotos then.
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |4.7.0
   Target Milestone|---                         |4.7.0
            Summary|comp-goto-2.c regresses in  |[4.7 regression]
                   |testing                     |comp-goto-2.c regresses in
                   |                            |testing
           Severity|normal                      |major
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #12 from Easwaran Raman <eraman at google dot com> 2011-07-14 17:16:06 UTC ---
(In reply to comment #11)
> I have confirmed that the -Os failures began with r175063 and that the tests
> pass for several revision before that and pass for several after, so it's
> unlikely to be an intermittent failure. If it would help I can send dump
> files for r175063 and the one just before that.

It is possible that the second DSE invocation deletes a necessary store. My
understanding was that it only acts on spilled stores and all my changes are in
the _nospill version, but that seems not to be the case. Could you send me all
the RTL dumps with and without this patch as a tar file? That will be very
useful in narrowing it down.

Thanks,
Easwaran
[Bug tree-optimization/49452] [4.7 regression] comp-goto-2.c regresses in testing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49452

--- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-07-14 22:10:16 UTC ---
I looked at the dumps for 920501-7.c and second invocation of DSE removes a
necessary store. The relevant dump for function x from
920501-7.c.198r.pro_and_epilogue is below:

(insn 2 58 53 2 (set (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -56 [0xffc8])) [6 %sfp+-20 S4 A32]) (reg:SI 0 r0 [ a ])) /scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:12 176 {*arm_movsi_insn} (nil))
...
(call_insn/c/i 11 9 12 2 (parallel [
      (call (mem:SI (symbol_ref:SI (y.1271) [flags 0x3] function_decl 0x5578cb80 y) [0 y S4 A32]) (const_int 0 [0]))
      (use (const_int 0 [0]))
      (clobber (reg:SI 14 lr))
    ]) /scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:20 242 {*call_symbol}
  (expr_list:REG_NORETURN (const_int 0 [0]) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)))
  (expr_list:REG_DEP_TRUE (use (reg:SI 0 r0)) (expr_list:REG_DEP_TRUE (use (reg:SI 12 ip)) (nil
...
(insn 24 18 30 3 (set (reg/i:SI 0 r0) (mem/c:SI (plus:SI (reg/f:SI 11 fp) (const_int -20 [0xffec])) [6 %sfp+-20 S4 A32])) /scratch/janisjo/arm-linux-fsf/src/gcc-mainline/gcc/testsuite/gcc.reghunt/920501-7.c:23 176 {*arm_movsi_insn} (nil))

Instruction 2 and 24 refer to the same location, but have different offset
relative to FP because the call to y changes FP. DSE doesn't (and can not, if it
is intra-procedural) know that they both refer to the same location and hence
thinks insn 2 is dead. It seems to me this (FP having different value after the
call) can only happen at postreload. It seems to me that setting wild_read (not
just non_frame_wild_read) on all calls after postreload will fix this problem.
What's the best way to do that? Will checking for clear_alias_sets != NULL work?
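
The reasoning above can be illustrated with a deliberately simplified model of an offset-keyed dead store check. This is only a sketch, not the logic of GCC's dse.c: the types, the function toy_store_looks_dead and the calls_are_wild_reads flag are invented names, and the flag merely stands in for the idea of treating every call as a wild read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction stream.  All names are hypothetical.  */
enum toy_op { TOY_STORE, TOY_LOAD, TOY_CALL };

struct toy_insn {
    enum toy_op op;
    int fp_offset;      /* ignored for TOY_CALL */
};

/* Decide whether the frame store at index STORE looks dead when
   locations are identified purely by their offset from fp.  A later
   load from the same offset keeps it alive.  A call keeps it alive
   only if CALLS_ARE_WILD_READS is set.  A later store to the same
   offset, or the end of the stream, makes it dead.  */
bool toy_store_looks_dead(const struct toy_insn *insns, size_t n,
                          size_t store, bool calls_are_wild_reads)
{
    assert(store < n && insns[store].op == TOY_STORE);
    int off = insns[store].fp_offset;

    for (size_t i = store + 1; i < n; i++) {
        if (insns[i].op == TOY_LOAD && insns[i].fp_offset == off)
            return false;
        if (insns[i].op == TOY_CALL && calls_are_wild_reads)
            return false;
        if (insns[i].op == TOY_STORE && insns[i].fp_offset == off)
            return true;
    }
    return true;
}
```

For a stream shaped like the dump (store at fp-56, call, load at fp-20), the check reports the store as dead when calls are not wild reads, because the two offsets differ even though they name the same slot. With calls treated as wild reads the store is kept. This is the behaviour the comment proposes; whether it is the right fix is debated elsewhere in the thread.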