[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Known to work|4.4.6 | Resolution||FIXED --- Comment #22 from Uros Bizjak 2012-01-15 20:44:29 UTC --- Fixed for 4.4+.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #21 from uros at gcc dot gnu.org 2012-01-15 20:38:39 UTC --- Author: uros Date: Sun Jan 15 20:38:32 2012 New Revision: 183200 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183200 Log: PR rtl-optimization/51821 * recog.c (peep2_find_free_register): Determine clobbered registers from insn pattern. testsuite/ChangeLog: PR rtl-optimization/51821 * gcc.dg/pr51821.c: New test. Added: branches/gcc-4_4-branch/gcc/testsuite/gcc.dg/pr51821.c Modified: branches/gcc-4_4-branch/gcc/ChangeLog branches/gcc-4_4-branch/gcc/recog.c branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #20 from uros at gcc dot gnu.org 2012-01-15 20:27:22 UTC --- Author: uros Date: Sun Jan 15 20:27:17 2012 New Revision: 183199 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183199 Log: PR rtl-optimization/51821 * recog.c (peep2_find_free_register): Determine clobbered registers from insn pattern. testsuite/ChangeLog: PR rtl-optimization/51821 * gcc.dg/pr51821.c: New test. Added: branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr51821.c Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/recog.c branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #19 from uros at gcc dot gnu.org 2012-01-15 19:35:20 UTC --- Author: uros Date: Sun Jan 15 19:35:15 2012 New Revision: 183198 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183198 Log: PR rtl-optimization/51821 * recog.c (peep2_find_free_register): Determine clobbered registers from insn pattern. testsuite/ChangeLog: PR rtl-optimization/51821 * gcc.dg/pr51821.c: New test. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr51821.c Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/recog.c branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #18 from uros at gcc dot gnu.org 2012-01-15 18:03:51 UTC --- Author: uros Date: Sun Jan 15 18:03:46 2012 New Revision: 183194 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183194 Log: PR rtl-optimization/51821 * recog.c (peep2_find_free_register): Determine clobbered registers from insn pattern. testsuite/ChangeLog: PR rtl-optimization/51821 * gcc.dg/pr51821.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr51821.c Modified: trunk/gcc/ChangeLog trunk/gcc/recog.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added URL||http://gcc.gnu.org/ml/gcc-p ||atches/2012-01/msg00630.htm ||l --- Comment #17 from Uros Bizjak 2012-01-12 19:18:06 UTC --- Patch at [1]. [1] http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00630.html
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |ubizjak at gmail dot com |gnu.org |
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #16 from Uros Bizjak 2012-01-12 17:00:48 UTC --- (In reply to comment #15) > Yes, this seems to be the correct approach. Patch that fixes the failure: Index: recog.c === --- recog.c (revision 183053) +++ recog.c (working copy) @@ -3038,6 +3038,7 @@ peep2_find_free_register (int from, int to, const static int search_ofs; enum reg_class cl; HARD_REG_SET live; + df_ref *def_rec; int i; gcc_assert (from < MAX_INSNS_PER_PEEP2 + 1); @@ -3051,12 +3052,14 @@ peep2_find_free_register (int from, int to, const while (from != to) { - HARD_REG_SET this_live; + gcc_assert (peep2_insn_data[from].insn != NULL_RTX); + /* Don't use registers set or clobbered by the insn. */ + for (def_rec = DF_INSN_DEFS (peep2_insn_data[from].insn); + *def_rec; def_rec++) + SET_HARD_REG_BIT (live, DF_REF_REGNO (*def_rec)); + from = peep2_buf_position (from + 1); - gcc_assert (peep2_insn_data[from].insn != NULL_RTX); - REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before); - IOR_HARD_REG_SET (live, this_live); } cl = (class_str[0] == 'r' ? GENERAL_REGS
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Eric Botcazou changed: What|Removed |Added Status|ASSIGNED|NEW AssignedTo|ebotcazou at gcc dot|unassigned at gcc dot |gnu.org |gnu.org --- Comment #15 from Eric Botcazou 2012-01-12 13:20:01 UTC --- > Please look into "peep2_insn_data[from].live_before" array and > "peep2_insn_data[from+1].live_before". You will see that the difference is > only in ax register. This array is set with df_simulate_one_insn_forwards, > and this particular function indeed clears set hard registers when REG_UNUSED > note is found (this is OK for live analysis). I did and saw exactly that. The problem is that I missed the REG_UNUSED note for edx among the 3 other notes (and also that you specifically mentioned it in the comment #9). OK, DF appears to be correct here, or at least consistent. > and since the difference between live registers before/after the insn is only > ax register, the loop claims others (including dx reg) as "available". The > calculation, which register is _NOT_CLOBBERED_ by the pattern from "live" > information is thus not enough, we have to look into the pattern and mark all > hard registers that are set or clobbered as _NOT_AVAILABLE_ in the register > set. Yes, this seems to be the correct approach.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #14 from Uros Bizjak 2012-01-12 11:42:00 UTC --- (In reply to comment #12) > I'm not sure I understand. If the peephole matches, then the insn pattern is > present in the insn stream with instantiated registers, so it's sufficient to > scan the insn stream. AFAICS the code doesn't see that edx is live after the > instruction because DF reports that only eax is; of course it's eax in DImode > so edx is "implicitly" live, but it's nevertheless live. A comment to "implicitly" live: my second example (Comment #9) marks ax register as available, but IT IS SET in the pattern! This supports the theory outlined in the Comment #13, but df_simulate_one_insn_forwards now marks unused register ax as available. Again, live analysis is not the correct tool to determine which registers are clobbered by the insn.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #13 from Uros Bizjak 2012-01-12 11:34:40 UTC --- (In reply to comment #12) > > Here we need to analyse the insn patterns for ALL sets and clobbers, not > > only > > track live registers through the insn stream. > > I'm not sure I understand. If the peephole matches, then the insn pattern is > present in the insn stream with instantiated registers, so it's sufficient to > scan the insn stream. AFAICS the code doesn't see that edx is live after the > instruction because DF reports that only eax is; of course it's eax in DImode > so edx is "implicitly" live, but it's nevertheless live. Please look into "peep2_insn_data[from].live_before" array and "peep2_insn_data[from+1].live_before". You will see that the difference is only in ax register. This array is set with df_simulate_one_insn_forwards, and this particular function indeed clears set hard registers when REG_UNUSED note is found (this is OK for live analysis). The referred loop processes (insn 7 19 16 2 (parallel [ (set (reg:DI 0 ax [65]) (ashift:DI (const_int -1 [0x]) (reg:QI 2 cx [orig:63 shift_size ] [63]))) (clobber (reg:CC 17 flags)) ]) pr51821.c:8 489 {*ashldi3_doubleword} (expr_list:REG_DEAD (reg:QI 2 cx [orig:63 shift_size ] [63]) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_UNUSED (reg:SI 1 dx) (expr_list:REG_EQUAL (ashift:DI (const_int -1 [0x]) (subreg:QI (reg/v:SI 2 cx [orig:63 shift_size ] [63]) 0)) (nil)) and since the difference between live registers before/after the insn is only ax register, the loop claims others (including dx reg) as "available". The calculation, which register is _NOT_CLOBBERED_ by the pattern from "live" information is thus not enough, we have to look into the pattern and mark all hard registers that are set or clobbered as _NOT_AVAILABLE_ in the register set.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #12 from Eric Botcazou 2012-01-12 11:18:37 UTC --- > I think that DF is OK, the problem is in recog.c/peep2_find_free_register, > with > this loop: > > while (from != to) > { > HARD_REG_SET this_live; > > from = peep2_buf_position (from + 1); > gcc_assert (peep2_insn_data[from].insn != NULL_RTX); > REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before); > IOR_HARD_REG_SET (live, this_live); > } > > Here we need to analyse the insn patterns for ALL sets and clobbers, not only > track live registers through the insn stream. I'm not sure I understand. If the peephole matches, then the insn pattern is present in the insn stream with instantiated registers, so it's sufficient to scan the insn stream. AFAICS the code doesn't see that edx is live after the instruction because DF reports that only eax is; of course it's eax in DImode so edx is "implicitly" live, but it's nevertheless live.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #11 from Uros Bizjak 2012-01-12 08:04:13 UTC --- I think that DF is OK, the problem is in recog.c/peep2_find_free_register, with this loop: while (from != to) { HARD_REG_SET this_live; from = peep2_buf_position (from + 1); gcc_assert (peep2_insn_data[from].insn != NULL_RTX); REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before); IOR_HARD_REG_SET (live, this_live); } Here we need to analyse the insn patterns for ALL sets and clobbers, not only track live registers through the insn stream.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Eric Botcazou changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #10 from Eric Botcazou 2012-01-12 07:43:05 UTC --- Reassigning...
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 --- Comment #9 from Uros Bizjak 2012-01-11 22:54:34 UTC --- (In reply to comment #8) > It looks like target issue after all. I'm preparing a patch with following OTOH, if the pattern says that it outputs to DImode pair (i.e. %eax/%edx, then the whole pair is clobbered, no matter if any separate register is later used or not. Due to this rationale, %edx should not be allocated and the patch from comment #8 only papers over real problem. To illustrate this problem further, following test clobbers %eax: --cut here-- unsigned int __attribute__((noinline)) test (int shift_size) { unsigned long long res = ~0; res <<= shift_size; return res >> 32; } --cut here-- (insn 7 22 20 2 (parallel [ (set (reg:DI 0 ax [65]) (ashift:DI (const_int -1 [0x]) (reg:QI 2 cx [orig:63 shift_size ] [63]))) (clobber (reg:CC 17 flags)) ]) t.c:8 489 {*ashldi3_doubleword} (expr_list:REG_DEAD (reg:QI 2 cx [orig:63 shift_size ] [63]) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_UNUSED (reg:SI 0 ax) (expr_list:REG_EQUAL (ashift:DI (const_int -1 [0x]) (subreg:QI (reg/v:SI 2 cx [orig:63 shift_size ] [63]) 0)) (nil)) peephole2 pass allocates SImode ax register as scratch: (insn 31 30 32 2 (parallel [ (set (reg:SI 0 ax) (const_int 0 [0])) (clobber (reg:CC 17 flags)) ]) t.c:8 -1 (nil)) (insn 32 31 33 2 (set (reg:CCZ 17 flags) (compare:CCZ (and:QI (reg:QI 2 cx [orig:63 shift_size ] [63]) (const_int 32 [0x20])) (const_int 0 [0]))) t.c:8 -1 (nil)) (insn 33 32 34 2 (set (reg:SI 1 dx [+4 ]) (if_then_else:SI (ne (reg:CCZ 17 flags) (const_int 0 [0])) (reg:SI 0 ax [65]) (reg:SI 1 dx [+4 ]))) t.c:8 -1 (nil)) (insn 34 33 20 2 (set (reg:SI 0 ax [65]) (if_then_else:SI (ne (reg:CCZ 17 flags) (const_int 0 [0])) (reg:SI 0 ax) (reg:SI 0 ax [65]))) t.c:8 -1 (nil)) I'll speculate that (expr_list:REG_UNUSED (reg:SI 0 ax)) plays critical role here. Please compare this with (expr_list:REG_UNUSED (reg:SI 1 dx)) in the pre-peephole2 pattern in Comment #4.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added Status|ASSIGNED|NEW --- Comment #8 from Uros Bizjak 2012-01-11 22:34:41 UTC --- It looks like target issue after all. I'm preparing a patch with following comment to a peephole2 that calls ix86_split_ashl: Index: i386.md === --- i386.md (revision 183054) +++ i386.md (working copy) @@ -8939,11 +8939,17 @@ ;; values are manipulated, registers are already at a premium. But if ;; we have one handy, we won't turn it away. +;; ??? Immediate operands should not be allowed for operand 1 predicate. +;; In case one of high/low registers in register pair is unused, then +;; nothing prevents peephole2 pass from allocating this unused register +;; as scratch register. Output register pair always matches operand 1 +;; input register pair, so register_operand operand 1 predicate is safe. + (define_peephole2 [(match_scratch:DWIH 3 "r") (parallel [(set (match_operand: 0 "register_operand" "") (ashift: -(match_operand: 1 "nonmemory_operand" "") +(match_operand: 1 "register_operand" "") (match_operand:QI 2 "nonmemory_operand" ""))) (clobber (reg:CC FLAGS_REG))]) (match_dup 3)]
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Eric Botcazou changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |ebotcazou at gcc dot |gnu.org |gnu.org --- Comment #7 from Eric Botcazou 2012-01-11 22:03:16 UTC --- > RTL infrastructure bug, adding some CCs that might find it interesting. Ugh. The problem is that the df_simulate_* routines don't correctly handle hard registers. Oddly enough, there is an isolated use of hard_regno_nregs in the df_simulate_one_insn_forwards routine, but with no comment whatsoever. We need to do something, but the impact of the change isn't very clear.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added CC||ebotcazou at gcc dot ||gnu.org, rsandifo at gcc ||dot gnu.org --- Comment #6 from Uros Bizjak 2012-01-11 14:14:41 UTC --- RTL infrastructure bug, adding some CCs that might find it interesting.
[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821 Uros Bizjak changed: What|Removed |Added CC||ubizjak at gmail dot com Component|target |rtl-optimization --- Comment #5 from Uros Bizjak 2012-01-11 13:20:53 UTC --- rtl_optimization problem with peephole2 pass.