[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-15 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
  Known to work|4.4.6   |
 Resolution||FIXED

--- Comment #22 from Uros Bizjak  2012-01-15 20:44:29 
UTC ---
Fixed for 4.4+.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-15 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #21 from uros at gcc dot gnu.org 2012-01-15 20:38:39 UTC ---
Author: uros
Date: Sun Jan 15 20:38:32 2012
New Revision: 183200

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183200
Log:
PR rtl-optimization/51821
* recog.c (peep2_find_free_register): Determine clobbered registers
from insn pattern.

testsuite/ChangeLog:

PR rtl-optimization/51821
* gcc.dg/pr51821.c: New test.


Added:
branches/gcc-4_4-branch/gcc/testsuite/gcc.dg/pr51821.c
Modified:
branches/gcc-4_4-branch/gcc/ChangeLog
branches/gcc-4_4-branch/gcc/recog.c
branches/gcc-4_4-branch/gcc/testsuite/ChangeLog


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-15 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #20 from uros at gcc dot gnu.org 2012-01-15 20:27:22 UTC ---
Author: uros
Date: Sun Jan 15 20:27:17 2012
New Revision: 183199

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183199
Log:
PR rtl-optimization/51821
* recog.c (peep2_find_free_register): Determine clobbered registers
from insn pattern.

testsuite/ChangeLog:

PR rtl-optimization/51821
* gcc.dg/pr51821.c: New test.


Added:
branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr51821.c
Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/recog.c
branches/gcc-4_5-branch/gcc/testsuite/ChangeLog


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-15 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #19 from uros at gcc dot gnu.org 2012-01-15 19:35:20 UTC ---
Author: uros
Date: Sun Jan 15 19:35:15 2012
New Revision: 183198

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183198
Log:
PR rtl-optimization/51821
* recog.c (peep2_find_free_register): Determine clobbered registers
from insn pattern.

testsuite/ChangeLog:

PR rtl-optimization/51821
* gcc.dg/pr51821.c: New test.


Added:
branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr51821.c
Modified:
branches/gcc-4_6-branch/gcc/ChangeLog
branches/gcc-4_6-branch/gcc/recog.c
branches/gcc-4_6-branch/gcc/testsuite/ChangeLog


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-15 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #18 from uros at gcc dot gnu.org 2012-01-15 18:03:51 UTC ---
Author: uros
Date: Sun Jan 15 18:03:46 2012
New Revision: 183194

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183194
Log:
PR rtl-optimization/51821
* recog.c (peep2_find_free_register): Determine clobbered registers
from insn pattern.

testsuite/ChangeLog:

PR rtl-optimization/51821
* gcc.dg/pr51821.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/pr51821.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/recog.c
trunk/gcc/testsuite/ChangeLog


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

URL||http://gcc.gnu.org/ml/gcc-p
   ||atches/2012-01/msg00630.htm
   ||l

--- Comment #17 from Uros Bizjak  2012-01-12 19:18:06 
UTC ---
Patch at [1].

[1] http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00630.html


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |ubizjak at gmail dot com
   |gnu.org |


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #16 from Uros Bizjak  2012-01-12 17:00:48 
UTC ---
(In reply to comment #15)
> Yes, this seems to be the correct approach.

Patch that fixes the failure:

Index: recog.c
===
--- recog.c (revision 183053)
+++ recog.c (working copy)
@@ -3038,6 +3038,7 @@ peep2_find_free_register (int from, int to, const
   static int search_ofs;
   enum reg_class cl;
   HARD_REG_SET live;
+  df_ref *def_rec;
   int i;

   gcc_assert (from < MAX_INSNS_PER_PEEP2 + 1);
@@ -3051,12 +3052,14 @@ peep2_find_free_register (int from, int to, const

   while (from != to)
 {
-  HARD_REG_SET this_live;
+  gcc_assert (peep2_insn_data[from].insn != NULL_RTX);

+  /* Don't use registers set or clobbered by the insn.  */
+  for (def_rec = DF_INSN_DEFS (peep2_insn_data[from].insn);
+  *def_rec; def_rec++)
+   SET_HARD_REG_BIT (live, DF_REF_REGNO (*def_rec));
+
   from = peep2_buf_position (from + 1);
-  gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
-  REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before);
-  IOR_HARD_REG_SET (live, this_live);
 }

   cl = (class_str[0] == 'r' ? GENERAL_REGS


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Eric Botcazou  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
 AssignedTo|ebotcazou at gcc dot|unassigned at gcc dot
   |gnu.org |gnu.org

--- Comment #15 from Eric Botcazou  2012-01-12 
13:20:01 UTC ---
> Please look into "peep2_insn_data[from].live_before" array and
> "peep2_insn_data[from+1].live_before". You will see that the difference is 
> only in ax register. This array is set with df_simulate_one_insn_forwards,
> and this particular function indeed clears set hard registers when REG_UNUSED
> note is found (this is OK for live analysis).

I did and saw exactly that.  The problem is that I missed the REG_UNUSED note
for edx among the 3 other notes (and also that you specifically mentioned it in
the comment #9).  OK, DF appears to be correct here, or at least consistent.

> and since the difference between live registers before/after the insn is only
> ax register, the loop claims others (including dx reg) as "available". The
> calculation, which register is _NOT_CLOBBERED_ by the pattern from "live"
> information is thus not enough, we have to look into the pattern and mark all
> hard registers that are set or clobbered as _NOT_AVAILABLE_ in the register
> set.

Yes, this seems to be the correct approach.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #14 from Uros Bizjak  2012-01-12 11:42:00 
UTC ---
(In reply to comment #12)

> I'm not sure I understand.  If the peephole matches, then the insn pattern is
> present in the insn stream with instantiated registers, so it's sufficient to
> scan the insn stream.  AFAICS the code doesn't see that edx is live after the
> instruction because DF reports that only eax is; of course it's eax in DImode
> so edx is "implicitly" live, but it's nevertheless live.

A comment to "implicitly" live: my second example (Comment #9) marks ax
register as available, but IT IS SET in the pattern! This supports the theory
outlined in the Comment #13, but df_simulate_one_insn_forwards now marks unused
register ax as available. Again, live analysis is not the correct tool to
determine which registers are clobbered by the insn.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #13 from Uros Bizjak  2012-01-12 11:34:40 
UTC ---
(In reply to comment #12)

> > Here we need to analyse the insn patterns for ALL sets and clobbers, not 
> > only
> > track live registers through the insn stream.
> 
> I'm not sure I understand.  If the peephole matches, then the insn pattern is
> present in the insn stream with instantiated registers, so it's sufficient to
> scan the insn stream.  AFAICS the code doesn't see that edx is live after the
> instruction because DF reports that only eax is; of course it's eax in DImode
> so edx is "implicitly" live, but it's nevertheless live.

Please look into "peep2_insn_data[from].live_before" array and
"peep2_insn_data[from+1].live_before". You will see that the difference is only
in ax register. This array is set with df_simulate_one_insn_forwards, and this
particular function indeed clears set hard registers when REG_UNUSED note is
found (this is OK for live analysis).

The referred loop processes

(insn 7 19 16 2 (parallel [
(set (reg:DI 0 ax [65])
(ashift:DI (const_int -1 [0x])
(reg:QI 2 cx [orig:63 shift_size ] [63])))
(clobber (reg:CC 17 flags))
]) pr51821.c:8 489 {*ashldi3_doubleword}
 (expr_list:REG_DEAD (reg:QI 2 cx [orig:63 shift_size ] [63])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_UNUSED (reg:SI 1 dx)
(expr_list:REG_EQUAL (ashift:DI (const_int -1
[0x])
(subreg:QI (reg/v:SI 2 cx [orig:63 shift_size ] [63])
0))
(nil))

and since the difference between live registers before/after the insn is only
ax register, the loop claims others (including dx reg) as "available". The
calculation, which register is _NOT_CLOBBERED_ by the pattern from "live"
information is thus not enough, we have to look into the pattern and mark all
hard registers that are set or clobbered as _NOT_AVAILABLE_ in the register
set.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #12 from Eric Botcazou  2012-01-12 
11:18:37 UTC ---
> I think that DF is OK, the problem is in recog.c/peep2_find_free_register, 
> with
> this loop:
> 
>   while (from != to)
> {
>   HARD_REG_SET this_live;
> 
>   from = peep2_buf_position (from + 1);
>   gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
>   REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before);
>   IOR_HARD_REG_SET (live, this_live);
> }
> 
> Here we need to analyse the insn patterns for ALL sets and clobbers, not only
> track live registers through the insn stream.

I'm not sure I understand.  If the peephole matches, then the insn pattern is
present in the insn stream with instantiated registers, so it's sufficient to
scan the insn stream.  AFAICS the code doesn't see that edx is live after the
instruction because DF reports that only eax is; of course it's eax in DImode
so edx is "implicitly" live, but it's nevertheless live.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #11 from Uros Bizjak  2012-01-12 08:04:13 
UTC ---
I think that DF is OK, the problem is in recog.c/peep2_find_free_register, with
this loop:

  while (from != to)
{
  HARD_REG_SET this_live;

  from = peep2_buf_position (from + 1);
  gcc_assert (peep2_insn_data[from].insn != NULL_RTX);
  REG_SET_TO_HARD_REG_SET (this_live, peep2_insn_data[from].live_before);
  IOR_HARD_REG_SET (live, this_live);
}

Here we need to analyse the insn patterns for ALL sets and clobbers, not only
track live registers through the insn stream.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Eric Botcazou  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #10 from Eric Botcazou  2012-01-12 
07:43:05 UTC ---
Reassigning...


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

--- Comment #9 from Uros Bizjak  2012-01-11 22:54:34 
UTC ---
(In reply to comment #8)
> It looks like target issue after all. I'm preparing a patch with following

OTOH, if the pattern says that it outputs to DImode pair (i.e. %eax/%edx, then
the whole pair is clobbered, no matter if any separate register is later used
or not. Due to this rationale, %edx should not be allocated and the patch from
comment #8 only papers over real problem.

To illustrate this problem further, following test clobbers %eax:

--cut here--
unsigned int  __attribute__((noinline))
test (int shift_size)
{
  unsigned long long res = ~0;

  res <<= shift_size;

  return res >> 32;
}
--cut here--

(insn 7 22 20 2 (parallel [
(set (reg:DI 0 ax [65])
(ashift:DI (const_int -1 [0x])
(reg:QI 2 cx [orig:63 shift_size ] [63])))
(clobber (reg:CC 17 flags))
]) t.c:8 489 {*ashldi3_doubleword}
 (expr_list:REG_DEAD (reg:QI 2 cx [orig:63 shift_size ] [63])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_UNUSED (reg:SI 0 ax)
(expr_list:REG_EQUAL (ashift:DI (const_int -1
[0x])
(subreg:QI (reg/v:SI 2 cx [orig:63 shift_size ] [63])
0))
(nil))

peephole2 pass allocates SImode ax register as scratch:

(insn 31 30 32 2 (parallel [
(set (reg:SI 0 ax)
(const_int 0 [0]))
(clobber (reg:CC 17 flags))
]) t.c:8 -1
 (nil))

(insn 32 31 33 2 (set (reg:CCZ 17 flags)
(compare:CCZ (and:QI (reg:QI 2 cx [orig:63 shift_size ] [63])
(const_int 32 [0x20]))
(const_int 0 [0]))) t.c:8 -1
 (nil))

(insn 33 32 34 2 (set (reg:SI 1 dx [+4 ])
(if_then_else:SI (ne (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg:SI 0 ax [65])
(reg:SI 1 dx [+4 ]))) t.c:8 -1
 (nil))

(insn 34 33 20 2 (set (reg:SI 0 ax [65])
(if_then_else:SI (ne (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg:SI 0 ax)
(reg:SI 0 ax [65]))) t.c:8 -1
 (nil))

I'll speculate that (expr_list:REG_UNUSED (reg:SI 0 ax)) plays critical role
here.  Please compare this with (expr_list:REG_UNUSED (reg:SI 1 dx)) in the
pre-peephole2 pattern in Comment #4.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW

--- Comment #8 from Uros Bizjak  2012-01-11 22:34:41 
UTC ---
It looks like target issue after all. I'm preparing a patch with following
comment to a peephole2 that calls ix86_split_ashl:

Index: i386.md
===
--- i386.md (revision 183054)
+++ i386.md (working copy)
@@ -8939,11 +8939,17 @@
 ;; values are manipulated, registers are already at a premium.  But if
 ;; we have one handy, we won't turn it away.

+;; ??? Immediate operands should not be allowed for operand 1 predicate.
+;; In case one of high/low registers in register pair is unused, then
+;; nothing prevents peephole2 pass from allocating this unused register
+;; as scratch register.  Output register pair always matches operand 1
+;; input register pair, so register_operand operand 1 predicate is safe.
+
 (define_peephole2
   [(match_scratch:DWIH 3 "r")
(parallel [(set (match_operand: 0 "register_operand" "")
   (ashift:
-(match_operand: 1 "nonmemory_operand" "")
+(match_operand: 1 "register_operand" "")
 (match_operand:QI 2 "nonmemory_operand" "")))
  (clobber (reg:CC FLAGS_REG))])
(match_dup 3)]


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Eric Botcazou  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |ebotcazou at gcc dot
   |gnu.org |gnu.org

--- Comment #7 from Eric Botcazou  2012-01-11 
22:03:16 UTC ---
> RTL infrastructure bug, adding some CCs that might find it interesting.

Ugh.  The problem is that the df_simulate_* routines don't correctly handle
hard registers.  Oddly enough, there is an isolated use of hard_regno_nregs in
the df_simulate_one_insn_forwards routine, but with no comment whatsoever.

We need to do something, but the impact of the change isn't very clear.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot
   ||gnu.org, rsandifo at gcc
   ||dot gnu.org

--- Comment #6 from Uros Bizjak  2012-01-11 14:14:41 
UTC ---
RTL infrastructure bug, adding some CCs that might find it interesting.


[Bug rtl-optimization/51821] [4.5/4.6/4.7 Regression] 64bit > 32bit conversion produces incorrect results with optimizations

2012-01-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51821

Uros Bizjak  changed:

   What|Removed |Added

 CC||ubizjak at gmail dot com
  Component|target  |rtl-optimization

--- Comment #5 from Uros Bizjak  2012-01-11 13:20:53 
UTC ---
rtl_optimization problem with peephole2 pass.